Summary. In this paper, we present an overview of the recent developments of
functional quantization of stochastic processes, with an emphasis on the quadratic
case. Functional quantization is a way to approximate a process, viewed as a
Hilbert-valued random variable, using a nearest neighbour projection on a finite
codebook. Special emphasis is placed on the computational aspects and the numerical
applications, in particular the pricing of some path-dependent European options.



1 Introduction 

Functional quantization is a way to discretize the path space of a stochastic 
process. It has been extensively investigated since the early 2000's by several 
authors (see among others [55], [3T], [T^], [S], [32], etc). It first appeared as 
a natural extension of the Optimal Vector Quantization theory of (finite- 
dimensional) random vectors which finds its origin in the early 1950's for 
signal processing (see [T^ or [TT]). 

Let us consider a Hilbertian setting. One considers a random vector $X$ defined
on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$ taking its values in a separable Hilbert
space $(H, (\cdot|\cdot)_H)$ (equipped with its natural Borel $\sigma$-algebra) and satisfying
$\mathbb{E}|X|_H^2 < +\infty$. When $H$ is a Euclidean space ($\mathbb{R}^d$), one speaks about Vector
Quantization. When $H$ is an infinite dimensional space like $L^2_T := L^2([0,T], dt)$
(endowed with the usual Hilbertian norm $|f|_{L^2_T} := (\int_0^T f^2(t)\,dt)^{1/2}$) one speaks
of functional quantization. A (bi-measurable) stochastic process $(X_t)_{t\in[0,T]}$
defined on $(\Omega, \mathcal{A}, \mathbb{P})$ satisfying $|X(\omega)|_{L^2_T} < +\infty$
$\mathbb{P}(d\omega)$-a.s. can always be seen, once possibly modified on a $\mathbb{P}$-negligible set,
as an $L^2_T$-valued random variable. Although we will focus on the Hilbertian
framework, other choices are possible for $H$, in particular some more general
Banach settings like $L^p([0,T], dt)$ or $C([0,T], \mathbb{R})$ spaces.

This paper is organized as follows: in Section 2 we introduce quadratic
quantization in a Hilbertian setting. In Section 3 we focus on optimal
quantization, including some extensions to non-quadratic quantization. Section 4 is
devoted to some quantized cubature formulae. Section 5 provides some classical
background on the quantization rate in finite dimension. Section 7 deals
with functional quantization of Gaussian processes, like the Brownian motion,
with a special emphasis on the numerical aspects. We present here what
is, to our knowledge, the first large scale numerical optimization of the quadratic
quantization of the Brownian motion. We compare it to the optimal product
quantization, formerly investigated in [44]. In Section 8, we propose a constructive
approach to the functional quantization of scalar or multidimensional
diffusions (in the Stratonovich sense). In Section 9 we show how to use functional
quantization to price path-dependent options like Asian options (in a
Heston stochastic volatility model). We conclude with some recent results showing
how to derive universal (often optimal) functional quantization rates from the
time regularity of a process in Section 10 and with a few clues in Section 11
about the specific methods that produce some lower bounds (this important
subject, like many others such as the connections with small deviation theory, is not
treated in this numerically oriented overview). As concerns statistical applications
of functional quantization we refer to [53], [54].

Notations. • $a_n \approx b_n$ means $a_n = O(b_n)$ and $b_n = O(a_n)$; $a_n \sim b_n$ means
$a_n = b_n + o(a_n)$.

• If $X : (\Omega, \mathcal{A}, \mathbb{P}) \to (H, |\cdot|_H)$ (Hilbert space), then $\|X\|_2 = (\mathbb{E}|X|_H^2)^{1/2}$.

• $[x]$ denotes the integral part of the real $x$.



2 What is quadratic functional quantization? 

Let $(H, (\cdot|\cdot)_H)$ denote a separable Hilbert space. Let $X \in L^2_H(\mathbb{P})$, i.e. a random
vector $X : (\Omega, \mathcal{A}, \mathbb{P}) \to H$ ($H$ is endowed with its Borel $\sigma$-algebra) such that
$\mathbb{E}|X|_H^2 < +\infty$. An $N$-quantizer (or $N$-codebook) is defined as a subset

$$\Gamma := \{x_1, \dots, x_N\} \subset H$$

with $\mathrm{card}\,\Gamma = N$. In numerical applications, $\Gamma$ is also called a grid. Then, one
can quantize (or simply discretize) $X$ by $q(X)$ where $q : H \to \Gamma$ is a Borel
function. It is straightforward that

$$\forall\,\omega \in \Omega,\quad |X(\omega) - q(X(\omega))|_H \ge d(X(\omega), \Gamma) = \min_{1\le i\le N} |X(\omega) - x_i|_H$$

so that the best pointwise approximation of $X$ is provided by considering for
$q$ a nearest neighbour projection on $\Gamma$, denoted $\mathrm{Proj}_\Gamma$. Such a projection is
in one-to-one correspondence with the Voronoi partitions (or diagrams) of $H$
induced by $\Gamma$, i.e. the Borel partitions $(C_i(\Gamma))_{1\le i\le N}$ of $H$ satisfying

$$C_i(\Gamma) \subset \Big\{\xi \in H : |\xi - x_i|_H = \min_{1\le j\le N} |\xi - x_j|_H\Big\} = \overline{C}_i(\Gamma), \quad i = 1, \dots, N,$$

where $\overline{C}_i(\Gamma)$ denotes the closure of $C_i(\Gamma)$ in $H$ (this heavily uses the Hilbert
structure). Then

$$\mathrm{Proj}_\Gamma(\xi) := \sum_{i=1}^N x_i \mathbf{1}_{C_i(\Gamma)}(\xi)$$

is a nearest neighbour projection on $\Gamma$. These projections only differ on the
boundaries of the Voronoi cells $C_i(\Gamma)$, $i = 1, \dots, N$. All Voronoi partitions
have the same boundary contained in the union of the median hyperplanes
defined by the pairs $(x_i, x_j)$, $i \ne j$. Figure 1 represents the Voronoi diagram
defined by a (random) 10-tuple in $\mathbb{R}^2$. Then, one defines a Voronoi



Fig. 1. A 2-dimensional 10-quantizer $\Gamma = \{x_1, \dots, x_{10}\}$ and its Voronoi diagram.



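$N$-quantization of $X$ by setting, for every $\omega \in \Omega$,

$$\widehat{X}^\Gamma(\omega) := \mathrm{Proj}_\Gamma(X(\omega)) = \sum_{i=1}^N x_i \mathbf{1}_{C_i(\Gamma)}(X(\omega)).$$

One clearly has, still for every $\omega \in \Omega$, that

$$|X(\omega) - \widehat{X}^\Gamma(\omega)|_H = \mathrm{dist}_H(X(\omega), \Gamma) = \min_{1\le i\le N} |X(\omega) - x_i|_H.$$

The mean (quadratic) quantization error is then defined by

$$e(\Gamma, X, H) = \|X - \widehat{X}^\Gamma\|_2 = \sqrt{\mathbb{E}\Big(\min_{1\le i\le N} |X - x_i|_H^2\Big)}. \qquad (1)$$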

The distribution of $\widehat{X}^\Gamma$ as a random vector is given by the $N$-tuple
$(\mathbb{P}(X \in C_i(\Gamma)))_{1\le i\le N}$ of the weights of the Voronoi cells. This distribution clearly
depends on the choice of the Voronoi partition as emphasized by the following elementary
situation: if $H = \mathbb{R}$, the distribution of $X$ is given by $\mathbb{P}_X = \frac{1}{3}(\delta_0 + \delta_{1/2} + \delta_1)$,
$N = 2$ and $\Gamma = \{0, 1\}$, since $1/2 \in \partial C_0(\Gamma) \cap \partial C_1(\Gamma)$. However, if $\mathbb{P}_X$ weights
no hyperplane, the distribution of $\widehat{X}^\Gamma$ depends only on $\Gamma$.
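The construction above can be illustrated numerically in the finite dimensional case $H = \mathbb{R}^2$. The following is a minimal sketch, not taken from the paper: the random grid, the Gaussian sample, the Monte Carlo sample size and the function name `nearest_neighbour_quantize` are all illustrative assumptions. It projects a sample of $X$ on a 10-point grid, then estimates the weights $\mathbb{P}(X \in C_i(\Gamma))$ and the quantization error (1) by Monte Carlo.

```python
import numpy as np

def nearest_neighbour_quantize(samples, grid):
    """Index of the closest grid point for each sample, and the squared distance.

    Ties are broken by the lowest index, which selects one particular
    Voronoi partition among the admissible ones.
    """
    # squared Euclidean distances, shape (n_samples, N)
    d2 = ((samples[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1)

rng = np.random.default_rng(0)
grid = rng.normal(size=(10, 2))          # a (random) 10-quantizer of R^2 (assumption)
samples = rng.normal(size=(20_000, 2))   # Monte Carlo sample of X ~ N(0, I_2) (assumption)

idx, d2min = nearest_neighbour_quantize(samples, grid)
x_hat = grid[idx]                                                # quantized samples
weights = np.bincount(idx, minlength=len(grid)) / len(samples)   # estimates P(X in C_i)
error = np.sqrt(d2min.mean())                                    # estimates e(Gamma, X, R^2)
```

Since the sample has a density, ties occur with probability zero, so the estimated weights do not depend on the tie-breaking rule, in line with the remark on distributions weighting no hyperplane.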

As concerns terminology, Vector Quantization is concerned with the finite
dimensional case - when $\dim H < +\infty$ - and is a rather old story, going
back to the early 1950's when it was designed in the field of signal processing
and then mainly developed in the community of Information Theory. The
term functional quantization, probably introduced in [41], [29], deals with the
infinite dimensional case including the more general Banach-valued setting.
The term "functional" comes from the fact that a typical infinite dimensional
Hilbert space is the function space $H = L^2_T$. Then, any (bi-measurable) process
$X : ([0,T] \times \Omega, \mathcal{B}or([0,T]) \otimes \mathcal{A}) \to (\mathbb{R}, \mathcal{B}or(\mathbb{R}))$ can be seen as a random vector
taking values in the set of Borel functions on $[0,T]$. Furthermore,
$((t,\omega) \mapsto X_t(\omega)) \in L^2(dt \otimes d\mathbb{P})$ if and only if $(\omega \mapsto X_\cdot(\omega)) \in L^2_H(\mathbb{P})$ since

$$\int_{[0,T]\times\Omega} X_t^2(\omega)\, dt\, \mathbb{P}(d\omega) = \int_\Omega \mathbb{P}(d\omega) \int_0^T X_t^2(\omega)\, dt = \mathbb{E}\,|X_\cdot|^2_{L^2_T}.$$



3 Optimal (quadratic) quantization 

At this stage we are led to wonder whether it is possible to design some
optimally fitted grids to a given distribution, i.e. grids which induce the lowest
possible mean quantization error among all grids of size at most $N$. This
amounts to the following optimization problem

$$e_N(X, H) := \inf_{\Gamma \subset H,\ \mathrm{card}(\Gamma) \le N} e(\Gamma, X, H). \qquad (2)$$



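It is convenient at this stage to make a correspondence between quantizers
of size at most $N$ and $N$-tuples of $H^N$: to any $N$-tuple $x := (x_1, \dots, x_N)$
corresponds a quantizer $\Gamma := \Gamma(x) = \{x_i,\ i = 1, \dots, N\}$ (of size at most $N$).
One introduces the quadratic distortion, denoted $D_N^X$, defined on $H^N$ as a
(symmetric) function by

$$D_N^X : H^N \to \mathbb{R}_+, \qquad (x_1, \dots, x_N) \mapsto \mathbb{E}\Big(\min_{1\le i\le N} |X - x_i|_H^2\Big).$$

Note that combining (1) and the definition of the distortion shows that

$$D_N^X(x_1, \dots, x_N) = \mathbb{E}\Big(\min_{1\le i\le N} |X - x_i|_H^2\Big) = \mathbb{E}\big(d(X, \Gamma(x))^2\big) = \|X - \widehat{X}^{\Gamma(x)}\|_2^2$$

so that

$$e_N(X, H) = \inf_{(x_1, \dots, x_N) \in H^N} \sqrt{D_N^X(x_1, \dots, x_N)}.$$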

The following proposition shows the existence of an optimal $N$-tuple
$x^{(N,*)} \in H^N$ such that $e_N(X, H) = \sqrt{D_N^X(x^{(N,*)})}$. The corresponding
optimal quantizer at level $N$ is denoted $\Gamma^{(N,*)} := \Gamma(x^{(N,*)})$. In finite dimension
we refer to [49] (1982) and in infinite dimension to [7] (1988) and [48] (1990);
one may also see [31], [17] and [29]. For recent developments on existence and
pathwise regularity of optimal quantizers see [20].

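Proposition 1. (a) The function $D_N^X$ is lower semi-continuous for the
product weak topology on $H^N$.

(b) The function $D_N^X$ reaches a minimum at an $N$-tuple $x^{(N,*)}$ (so that
$\Gamma^{(N,*)}$ is an optimal quantizer at level $N$).

- If $\mathrm{card}(\mathrm{supp}(\mathbb{P}_X)) \ge N$, the quantizer has full size $N$ (i.e. $\mathrm{card}(\Gamma^{(N,*)}) = N$)
and $e_N(X, H) < e_{N-1}(X, H)$.

- If $\mathrm{card}(\mathrm{supp}(\mathbb{P}_X)) \le N$, $e_N(X, H) = 0$.

Furthermore $\lim_N e_N(X, H) = 0$.

(c) Any optimal (Voronoi) quantization at level $N$, $\widehat{X}^{\Gamma^{(N,*)}}$, satisfies

$$\widehat{X}^{\Gamma^{(N,*)}} = \mathbb{E}\big(X \,|\, \sigma(\widehat{X}^{\Gamma^{(N,*)}})\big) \qquad (3)$$

where $\sigma(\widehat{X}^{\Gamma^{(N,*)}})$ denotes the $\sigma$-algebra generated by $\widehat{X}^{\Gamma^{(N,*)}}$.

(d) Any optimal (quadratic) quantization at level $N$ is a best least square (i.e.
$L^2(\mathbb{P})$) approximation of $X$ among all $H$-valued random variables taking at
most $N$ values:

$$e_N(X, H) = \|X - \widehat{X}^{\Gamma^{(N,*)}}\|_2 = \min\{\|X - Y\|_2,\ Y : (\Omega, \mathcal{A}) \to H,\ \mathrm{card}(Y(\Omega)) \le N\}.$$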



6 Gilles Pages 

Proof (sketch of): (a) The claim follows from the l.s.c. of $\xi \mapsto |\xi|_H$ for the
weak topology and Fatou's Lemma.

(b) One proceeds by induction on $N$. If $N = 1$, the optimal 1-quantizer is
$x^{(1,*)} = \{\mathbb{E}X\}$ and $e_1(X, H) = \|X - \mathbb{E}X\|_2$.

Assume now that an optimal quantizer $x^{(N,*)} = (x_1^{(N,*)}, \dots, x_N^{(N,*)})$ does
exist at level $N$.

- If $\mathrm{card}(\mathrm{supp}(\mathbb{P}_X)) \le N$, then the $(N+1)$-tuple $(x^{(N,*)}, x_1^{(N,*)})$ (among
other possibilities) is also optimal at level $N+1$ and $e_{N+1}(X, H) = e_N(X, H) = 0$.

- Otherwise, $\mathrm{card}(\mathrm{supp}(\mathbb{P}_X)) \ge N + 1$, hence $x^{(N,*)}$ has pairwise distinct
components and there exists $\xi_{N+1} \in \mathrm{supp}(\mathbb{P}_X) \setminus \{x_i^{(N,*)},\ i = 1, \dots, N\} \ne \emptyset$.

Then, with obvious notations,

$$D_{N+1}^X\big((x^{(N,*)}, \xi_{N+1})\big) < D_N^X(x^{(N,*)}).$$

Then, the set $F_{N+1} := \{x \in H^{N+1} \,|\, D_{N+1}^X(x) \le D_{N+1}^X((x^{(N,*)}, \xi_{N+1}))\}$ is non
empty and weakly closed since $D_{N+1}^X$ is l.s.c. Furthermore, it is bounded in $H^{N+1}$.
Otherwise there would exist a sequence $x_{(m)} \in H^{N+1}$ such that
$|x_{(m),i_m}|_H = \max_i |x_{(m),i}|_H \to +\infty$ as $m \to \infty$. Then, by Fatou's Lemma, one checks that

$$\liminf_m D_{N+1}^X(x_{(m)}) \ge D_N^X(x^{(N,*)}) > D_{N+1}^X\big((x^{(N,*)}, \xi_{N+1})\big).$$

Consequently $F_{N+1}$ is weakly compact and the minimum of $D_{N+1}^X$ on $F_{N+1}$
is clearly its minimum over the whole space $H^{N+1}$. In particular

$$e_{N+1}(X, H) \le \sqrt{D_{N+1}^X\big((x^{(N,*)}, \xi_{N+1})\big)} < e_N(X, H).$$

If $\mathrm{card}(\mathrm{supp}(\mathbb{P}_X)) = N + 1$, set $x^{(N+1,*)} = \mathrm{supp}(\mathbb{P}_X)$ (as sets) so that
$X = \widehat{X}^{\Gamma^{(N+1,*)}}$, which implies $e_{N+1}(X, H) = 0$.

To establish that $e_N(X, H)$ goes to 0, one considers an everywhere dense
sequence $(z_k)_{k\ge 1}$ in the separable space $H$. Then, $d(\{z_1, \dots, z_N\}, X(\omega))$ goes
to 0 as $N \to \infty$ for every $\omega \in \Omega$. Furthermore,
$d(\{z_1, \dots, z_N\}, X(\omega))^2 \le |X(\omega) - z_1|_H^2 \in L^1(\mathbb{P})$. One concludes by the Lebesgue
dominated convergence Theorem that $D_N^X(z_1, \dots, z_N)$ goes to 0 as $N \to \infty$.

(c) and (d) Temporarily set $\widehat{X}^* := \widehat{X}^{\Gamma^{(N,*)}}$ for convenience. Let $Y : (\Omega, \mathcal{A}) \to H$
be a random vector taking at most $N$ values. Set $\Gamma := Y(\Omega)$. Since $\widehat{X}^\Gamma$ is
a Voronoi quantization of $X$ induced by $\Gamma$,

$$|X - \widehat{X}^\Gamma|_H = d(X, \Gamma) \le |X - Y|_H$$

so that

$$\|X - \widehat{X}^\Gamma\|_2 \le \|X - Y\|_2.$$

On the other hand, the optimality of $\Gamma^{(N,*)}$ implies

$$\|X - \widehat{X}^*\|_2 \le \|X - \widehat{X}^\Gamma\|_2.$$

Consequently

$$\|X - \widehat{X}^*\|_2 \le \min\{\|X - Y\|_2,\ Y : (\Omega, \mathcal{A}) \to H,\ \mathrm{card}(Y(\Omega)) \le N\}.$$

The inequality holds as an equality since $\widehat{X}^*$ takes at most $N$ values.
Furthermore, considering random vectors of the form $Y = g(\widehat{X}^*)$ (which take at most
as many values as the size of $\Gamma^{(N,*)}$) shows, going back to the very definition
of conditional expectation, that $\widehat{X}^* = \mathbb{E}(X \,|\, \widehat{X}^*)$ $\mathbb{P}$-a.s. □

Item (c) introduces a very important notion in (quadratic) quantization. 

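Definition 1. A quantizer $\Gamma \subset H$ is stationary (or self-consistent) if (there
is a nearest neighbour projection such that $\widehat{X}^\Gamma = \mathrm{Proj}_\Gamma(X)$ satisfying)

$$\widehat{X}^\Gamma = \mathbb{E}\big(X \,|\, \widehat{X}^\Gamma\big). \qquad (4)$$

Note in particular that any stationary quantization satisfies $\mathbb{E}X = \mathbb{E}\widehat{X}^\Gamma$.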

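As shown by Proposition 1(c) any quadratic optimal quantizer at level
$N$ is stationary. Usually, at least when $d \ge 2$, there are other stationary
quantizers: indeed, the distortion function $D_N^X$ is $|\cdot|_H$-differentiable at
$N$-quantizers $x \in H^N$ with pairwise distinct components and

$$\nabla D_N^X(x) = 2\left(\int_{C_i(x)} (x_i - \xi)\, \mathbb{P}_X(d\xi)\right)_{1\le i\le N}
= 2\left(\mathbb{E}\Big((\widehat{X}^{\Gamma(x)} - X)\, \mathbf{1}_{\{\widehat{X}^{\Gamma(x)} = x_i\}}\Big)\right)_{1\le i\le N},$$

hence any critical point of $D_N^X$ is a stationary quantizer.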

Remarks and comments. • In fact (see Theorem 4.2, p. 38, [H]), the
Voronoi partitions of $\Gamma^{(N,*)}$ always have a $\mathbb{P}_X$-negligible boundary so that
(4) holds for any Voronoi diagram induced by $\Gamma^{(N,*)}$.

• The problem of the uniqueness of the optimal quantizer (viewed as a set) is
not mentioned in the above proposition. In higher dimension, this essentially
never occurs. In one dimension, uniqueness of the optimal $N$-quantizer was
first established in [14] with strictly log-concave density function. This was
successively extended in [23] and [55] and led to the following criterion (for
more general "loss" functions than the square function):

If the distribution of $X$ is absolutely continuous with a log-concave density
function, then, for every $N \ge 1$, there exists only one stationary quantizer of
size $N$, which turns out to be the optimal quantizer at level $N$.






More recently, a more geometric approach to uniqueness based on the 
Mountain Pass Lemma first developed in and then generalized in [5]) 
provided a slight extension of the above criterion (in terms of loss functions). 

This log-concavity assumption is satisfied by many families of probability
distributions like the uniform distribution on compact intervals, the normal
distributions, the gamma distributions. There are examples of distributions
with a non log-concave density function having a unique optimal quantizer
for every $N \ge 1$ (see e.g. the Pareto distribution in [16]). On the other hand
simple examples of scalar distributions having multiple optimal quantizers at
a given level can be found in [17].

• A stationary quantizer can be sub-optimal. This will be emphasized in
Section 7 for the Brownian motion (but it is also true for finite dimensional
Gaussian random vectors) where some families of sub-optimal quantizers - the
product quantizers designed from the Karhunen-Loève basis - are stationary
quantizers.

• For the uniform distribution over an interval $[a, b]$, there is a closed form
for the optimal quantizer at level $N$ given by
$\Gamma^{(N,*)} = \{a + (2i-1)\frac{b-a}{2N},\ i = 1, \dots, N\}$. This $N$-quantizer is optimal
not only in the quadratic case but also
for any $L^r$-quantization (see a definition further on). In general there is no such
closed form, either in 1 or higher dimension. However, in [16] some semi-closed
forms are obtained for several families of (scalar) distributions including the
exponential and the Pareto distributions: all the optimal quantizers can be
expressed using a single underlying sequence $(a_k)_{k\ge 1}$ defined by an induction
$a_{k+1} = F(a_k)$.

• In one dimension, as soon as the optimal quantizer at level $N$ is unique (as
a set or as an $N$-tuple with increasing components), it is generally possible
to compute it as the solution of the stationarity equation (4), either by a zero
search (Newton-Raphson gradient descent) or by a fixed point procedure (like the
specific Lloyd I procedure, see P?^).

• In higher dimension, deterministic optimization methods become intractable
and one uses stochastic procedures to compute optimal quantizers. The main
topic of this paper being functional quantization, we postpone the short
overview on these aspects to Section 7 devoted to the optimal quantization
of the Brownian motion. But it is to be noticed that all efficient optimization
methods rely on the so-called splitting method which increases progressively
the quantization level $N$. This method is directly inspired by the induction
developed in the proof of claim (b) of Proposition 1 since one designs the
starting value of the optimization procedure at size $N + 1$ by "merging" the
optimized $N$-quantizer obtained at level $N$ with one further point of $\mathbb{R}^d$,
usually randomly sampled with respect to an appropriate distribution (see [43]
for a discussion).

• As concerns functional quantization, e.g. $H = L^2_T$, there is a close connection
between the regularity of optimal (or even stationary) quantizers and that of
$t \mapsto X_t$ from $[0, T]$ into $L^2(\mathbb{P})$. Furthermore, as concerns optimal quantizers of
Gaussian processes, one shows (see [25]) that they belong to the reproducing
space of their covariance operator, e.g. to the Cameron-Martin space
$H^1 = \{\int_0^\cdot \dot{h}_s\, ds,\ \dot{h} \in L^2_T\}$ when $X = W$. Other properties of optimal quantization of
Gaussian processes are established in PU],
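The fixed point procedure mentioned above for the one-dimensional case can be sketched as follows for the standard normal distribution, whose density is log-concave so that the stationary quantizer is unique. This is only an illustrative sketch: the function name `lloyd_normal`, the starting grid and the number of iterations are assumptions, not the implementation used for the numerical results of this paper. Each iteration replaces $x_i$ by $\mathbb{E}(X \,|\, X \in C_i(\Gamma))$, which is exactly the stationarity equation (4) written cell by cell.

```python
import math

def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lloyd_normal(n_points, n_iter=500):
    """Lloyd I fixed-point iteration x_i <- E(X | X in C_i) for X ~ N(0,1)."""
    # increasing starting grid on [-2, 2] (arbitrary choice)
    x = [-2.0 + 4.0 * (i + 0.5) / n_points for i in range(n_points)]
    for _ in range(n_iter):
        # Voronoi cells of an increasing grid are intervals bounded by midpoints
        edges = [-math.inf] + [(a + b) / 2 for a, b in zip(x, x[1:])] + [math.inf]
        # conditional mean of N(0,1) on (lo, hi): (pdf(lo) - pdf(hi)) / (cdf(hi) - cdf(lo))
        x = [(pdf(lo) - pdf(hi)) / (cdf(hi) - cdf(lo))
             for lo, hi in zip(edges, edges[1:])]
    return x

two_points = lloyd_normal(2)
three_points = lloyd_normal(3)
```

For $N = 2$ the iteration returns the symmetric pair $\pm\sqrt{2/\pi}$, the conditional means of the two half-lines.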

Extensions to the $L^r(\mathbb{P})$-quantization of random variables. In this
paper, we focus on the purely quadratic framework ($L^2_T$ and $L^2(\mathbb{P})$-norms),
essentially because it is a natural (and somewhat easier) framework for the
computation of optimized grids for the Brownian motion and for some first
applications (like the pricing of path-dependent options, see Section 9). But a
more general and natural framework is to consider the functional quantization
of random vectors taking values in a separable Banach space $(E, |\cdot|_E)$. Let
$X : (\Omega, \mathcal{A}, \mathbb{P}) \to (E, |\cdot|_E)$ be such that $\mathbb{E}|X|_E^r < +\infty$ for some $r \ge 1$ (the case
$0 < r < 1$ can also be taken into consideration).

The $N$-level $(L^r(\mathbb{P}), |\cdot|_E)$-quantization problem for $X \in L^r_E(\mathbb{P})$ reads

$$e_{N,r}(X, E) := \inf\big\{\|X - \widehat{X}^\Gamma\|_r,\ \Gamma \subset E,\ \mathrm{card}(\Gamma) \le N\big\}.$$

The main examples for $(E, |\cdot|_E)$ are the non-Euclidean norms on $\mathbb{R}^d$, the
functional spaces $L^p_T(\mu) := L^p([0, T], \mu(dt))$, $1 \le p \le \infty$, equipped with their
usual norms, $(E, |\cdot|_E) = (C([0, T]), \|\cdot\|_{\sup})$, etc. As concerns the existence
of an optimal quantizer, it holds true for reflexive Banach spaces (see Parna
(90)) and $E = L^1_T$, but otherwise it may fail even when $N = 1$ (see pPl).
In finite dimension, the Euclidean feature is not crucial (see [H]). In the
functional setting, many results originally obtained in a Hilbert setting have
been extended to the Banach setting either for existence or regularity results
(see [20]) or for rates (see [10], [12], [30], [33]).

4 Cubature formulae: conditional expectation and 
numerical integration 

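Let $F : H \to \mathbb{R}$ be a continuous functional (with respect to the norm $|\cdot|_H$)
and let $\Gamma \subset H$ be an $N$-quantizer. It is natural to approximate $\mathbb{E}(F(X))$ by
$\mathbb{E}(F(\widehat{X}^\Gamma))$. This quantity $\mathbb{E}(F(\widehat{X}^\Gamma))$ is simply the finite weighted sum

$$\mathbb{E}\big(F(\widehat{X}^\Gamma)\big) = \sum_{i=1}^N F(x_i)\, \mathbb{P}(\widehat{X}^\Gamma = x_i).$$

Numerical computation of $\mathbb{E}(F(\widehat{X}^\Gamma))$ is possible as soon as $F(\xi)$ can be
computed at any $\xi \in H$ and the distribution $(\mathbb{P}(\widehat{X}^\Gamma = x_i))_{1\le i\le N}$ of $\widehat{X}^\Gamma$ is known.
The induced quantization error $\|X - \widehat{X}^\Gamma\|_2$ is used to control the error (see
below). These quantities related to the quantizer $\Gamma$ are also called companion
parameters.

Likewise, one can consider a priori the $\sigma(\widehat{X}^\Gamma)$-measurable random variable
$F(\widehat{X}^\Gamma)$ as a good approximation of the conditional expectation $\mathbb{E}(F(X) \,|\, \widehat{X}^\Gamma)$.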
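4.1 Lipschitz functionals

Assume that the functional $F$ is Lipschitz continuous on $H$. Then

$$\big|\mathbb{E}(F(X) \,|\, \widehat{X}^\Gamma) - F(\widehat{X}^\Gamma)\big| \le [F]_{\rm Lip}\, \mathbb{E}\big(|X - \widehat{X}^\Gamma| \,\big|\, \widehat{X}^\Gamma\big)$$

so that, for every real exponent $r \ge 1$,

$$\big\|\mathbb{E}(F(X) \,|\, \widehat{X}^\Gamma) - F(\widehat{X}^\Gamma)\big\|_r \le [F]_{\rm Lip}\, \|X - \widehat{X}^\Gamma\|_r$$

(where we applied the conditional Jensen inequality to the convex function $u \mapsto u^r$).
In particular, using that $\mathbb{E}F(X) = \mathbb{E}(\mathbb{E}(F(X) \,|\, \widehat{X}^\Gamma))$, one derives (with
$r = 1$) that

$$\big|\mathbb{E}F(X) - \mathbb{E}F(\widehat{X}^\Gamma)\big| \le \big\|\mathbb{E}(F(X) \,|\, \widehat{X}^\Gamma) - F(\widehat{X}^\Gamma)\big\|_1 \le [F]_{\rm Lip}\, \|X - \widehat{X}^\Gamma\|_1.$$

Finally, using the monotony of the $L^r(\mathbb{P})$-norms as a function of $r$ yields

$$\big|\mathbb{E}F(X) - \mathbb{E}F(\widehat{X}^\Gamma)\big| \le [F]_{\rm Lip}\, \|X - \widehat{X}^\Gamma\|_1 \le [F]_{\rm Lip}\, \|X - \widehat{X}^\Gamma\|_2. \qquad (5)$$

In fact, considering the Lipschitz functional $F(\xi) := d(\xi, \Gamma)$ shows that

$$\|X - \widehat{X}^\Gamma\|_1 = \sup_{[F]_{\rm Lip} \le 1} \big|\mathbb{E}F(X) - \mathbb{E}F(\widehat{X}^\Gamma)\big|. \qquad (6)$$

The Lipschitz functionals making up a characterizing family for the weak
convergence of probability measures on $H$, one derives that, for any sequence
of $N$-quantizers $\Gamma^N = \{x_1^N, \dots, x_N^N\}$ satisfying $\|X - \widehat{X}^{\Gamma^N}\|_1 \to 0$ as $N \to \infty$,

$$\sum_{1\le i\le N} \mathbb{P}\big(\widehat{X}^{\Gamma^N} = x_i^N\big)\, \delta_{x_i^N} \Rightarrow \mathbb{P}_X$$

where $\Rightarrow$ denotes the weak convergence of probability measures on $(H, |\cdot|_H)$.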
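As a small numerical illustration of the cubature formula and of the Lipschitz error bound (5), consider the uniform distribution on $[0,1]$, whose optimal quadratic $N$-quantizer is the midpoint grid with equal weights $1/N$. The sketch below is an assumption-laden toy example (the functional $F(x) = |x - 0.305|$, the level $N = 50$ and the helper names are illustrative choices); it compares the quantized expectation with the closed form of $\mathbb{E}F(X)$ and with the bound $[F]_{\rm Lip}\|X - \widehat{X}^\Gamma\|_2$.

```python
import math

def midpoint_grid(n):
    """Midpoint n-quantizer of U([0,1]); every Voronoi cell has weight 1/n."""
    return [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]

def quantized_expectation(func, grid, weights):
    """E F(X_hat) = sum_i F(x_i) P(X_hat = x_i)."""
    return sum(w * func(x) for x, w in zip(grid, weights))

n = 50
grid = midpoint_grid(n)
weights = [1.0 / n] * n

def F(x):
    """A Lipschitz functional with Lipschitz coefficient 1 (illustrative choice)."""
    return abs(x - 0.305)

approx = quantized_expectation(F, grid, weights)
exact = (0.305 ** 2 + 0.695 ** 2) / 2          # closed form of E|U - 0.305|, U ~ U([0,1])
quant_error = 1.0 / (2 * n * math.sqrt(3))     # ||U - U_hat||_2 for the midpoint grid
bound_holds = abs(approx - exact) <= 1.0 * quant_error
```

In this example the observed error is far below the bound, since $F$ is piecewise linear and the midpoint rule is exact on every cell that does not contain the kink.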


4.2 Differentiable functionals with Lipschitz differentials 

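Assume now that $F$ is differentiable on $H$, with a Lipschitz continuous
differential $DF$, and that the quantizer $\Gamma$ is stationary (see Equation (4)).
A Taylor expansion yields

$$\big|F(X) - F(\widehat{X}^\Gamma) - DF(\widehat{X}^\Gamma).(X - \widehat{X}^\Gamma)\big| \le [DF]_{\rm Lip}\, |X - \widehat{X}^\Gamma|^2.$$

Taking conditional expectation given $\widehat{X}^\Gamma$ yields

$$\Big|\mathbb{E}(F(X) \,|\, \widehat{X}^\Gamma) - F(\widehat{X}^\Gamma) - \mathbb{E}\big(DF(\widehat{X}^\Gamma).(X - \widehat{X}^\Gamma) \,\big|\, \widehat{X}^\Gamma\big)\Big| \le [DF]_{\rm Lip}\, \mathbb{E}\big(|X - \widehat{X}^\Gamma|^2 \,\big|\, \widehat{X}^\Gamma\big).$$

Now, using that the random variable $DF(\widehat{X}^\Gamma)$ is $\sigma(\widehat{X}^\Gamma)$-measurable and that
$\Gamma$ is stationary, one has

$$\mathbb{E}\big(DF(\widehat{X}^\Gamma).(X - \widehat{X}^\Gamma) \,\big|\, \widehat{X}^\Gamma\big) = DF(\widehat{X}^\Gamma).\mathbb{E}\big(X - \widehat{X}^\Gamma \,\big|\, \widehat{X}^\Gamma\big) = 0$$

so that

$$\big|\mathbb{E}(F(X) \,|\, \widehat{X}^\Gamma) - F(\widehat{X}^\Gamma)\big| \le [DF]_{\rm Lip}\, \mathbb{E}\big(|X - \widehat{X}^\Gamma|^2 \,\big|\, \widehat{X}^\Gamma\big).$$

Then, for every real exponent $r \ge 1$,

$$\big\|\mathbb{E}(F(X) \,|\, \widehat{X}^\Gamma) - F(\widehat{X}^\Gamma)\big\|_r \le [DF]_{\rm Lip}\, \|X - \widehat{X}^\Gamma\|_{2r}^2.$$

In particular, when $r = 1$, one derives like in the former setting

$$\big|\mathbb{E}F(X) - \mathbb{E}F(\widehat{X}^\Gamma)\big| \le [DF]_{\rm Lip}\, \|X - \widehat{X}^\Gamma\|_2^2. \qquad (7)$$

In fact, the above inequality holds provided $F$ is $C^1$ with Lipschitz differential
on every Voronoi cell $C_i(\Gamma)$. A similar characterization to (6) based on these
functionals could be established.

Some variants of these cubature formulae can be found in [43] or [21] for
functions or functionals $F$ having only some local Lipschitz regularity.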



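4.3 Quantized approximation of $\mathbb{E}(F(X) \,|\, Y)$

Let $X$ and $Y$ be two $H$-valued random vectors defined on the same
probability space $(\Omega, \mathcal{A}, \mathbb{P})$ and let $F : H \to \mathbb{R}$ be a Borel functional. The natural
idea is to approximate $\mathbb{E}(F(X) \,|\, Y)$ by the quantized conditional expectation
$\mathbb{E}(F(\widehat{X}) \,|\, \widehat{Y})$ where $\widehat{X}$ and $\widehat{Y}$ are quantizations of $X$ and $Y$ respectively.

Let $\varphi_F : H \to \mathbb{R}$ be a (Borel) version of the conditional expectation, i.e.
satisfying

$$\mathbb{E}(F(X) \,|\, Y) = \varphi_F(Y).$$

Usually, no closed form is available for the function $\varphi_F$ but some regularity
property can be established, especially in a (Feller) Markovian framework.
Thus assume that both $F$ and $\varphi_F$ are Lipschitz continuous with Lipschitz
coefficients $[F]_{\rm Lip}$ and $[\varphi_F]_{\rm Lip}$. Then

$$\mathbb{E}(F(X) \,|\, Y) - \mathbb{E}(F(\widehat{X}) \,|\, \widehat{Y}) = \mathbb{E}(F(X) \,|\, Y) - \mathbb{E}(F(X) \,|\, \widehat{Y}) + \mathbb{E}(F(X) - F(\widehat{X}) \,|\, \widehat{Y}).$$

Hence, using that $\widehat{Y}$ is $\sigma(Y)$-measurable and that conditional expectation is
an $L^2$-contraction,

$$\begin{aligned}
\|\mathbb{E}(F(X) \,|\, Y) - \mathbb{E}(F(X) \,|\, \widehat{Y})\|_2 &= \|\mathbb{E}(F(X) \,|\, Y) - \mathbb{E}(\mathbb{E}(F(X) \,|\, Y) \,|\, \widehat{Y})\|_2 \\
&\le \|\varphi_F(Y) - \mathbb{E}(F(X) \,|\, \widehat{Y})\|_2 \\
&= \|\varphi_F(Y) - \mathbb{E}(\varphi_F(Y) \,|\, \widehat{Y})\|_2 \\
&\le \|\varphi_F(Y) - \varphi_F(\widehat{Y})\|_2.
\end{aligned}$$

The last inequality follows from the definition of conditional expectation given
$\widehat{Y}$ as the best quadratic approximation among $\sigma(\widehat{Y})$-measurable random
variables. On the other hand, still using that $\mathbb{E}(\,\cdot \,|\, \sigma(\widehat{Y}))$ is an $L^2$-contraction and
this time that $F$ is Lipschitz continuous yields

$$\|\mathbb{E}(F(X) - F(\widehat{X}) \,|\, \widehat{Y})\|_2 \le \|F(X) - F(\widehat{X})\|_2 \le [F]_{\rm Lip}\, \|X - \widehat{X}\|_2.$$

Finally,

$$\|\mathbb{E}(F(X) \,|\, Y) - \mathbb{E}(F(\widehat{X}) \,|\, \widehat{Y})\|_2 \le [F]_{\rm Lip}\, \|X - \widehat{X}\|_2 + [\varphi_F]_{\rm Lip}\, \|Y - \widehat{Y}\|_2.$$

In the non-quadratic case the above inequality remains valid provided
$[\varphi_F]_{\rm Lip}$ is replaced by $2[\varphi_F]_{\rm Lip}$.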

5 Vector quantization rate ($H = \mathbb{R}^d$)

The fact that $e_N(X, \mathbb{R}^d)$ is a non-increasing sequence that goes to 0 as $N$
goes to $\infty$ is a rather simple result established in Proposition 1. Its rate of
convergence to 0 is a much more challenging problem. An answer is provided
by the so-called Zador Theorem stated below.

This theorem was first stated and established for distributions with
compact supports by Zador (see [S71[5H]). Then a first extension to general
probability distributions on $\mathbb{R}^d$ is developed in [S]. The first mathematically
rigorous proof can be found in [17], and relies on a random quantization argument
(Pierce Lemma).

Theorem 1. (a) Sharp rate. Let $r > 0$ and $X \in L^{r+\eta}(\mathbb{P})$ for some $\eta > 0$.
Let $\mathbb{P}_X(d\xi) = \varphi(\xi)\,\lambda_d(d\xi) + \nu(d\xi)$ be the canonical decomposition of the
distribution of $X$ ($\nu$ and the Lebesgue measure are singular). Then (if $\varphi \not\equiv 0$),

$$e_{N,r}(X, \mathbb{R}^d) \sim J_{r,d} \times \left(\int_{\mathbb{R}^d} \varphi^{\frac{d}{d+r}}(u)\, du\right)^{\frac{1}{d} + \frac{1}{r}} \times N^{-\frac{1}{d}} \quad \text{as } N \to +\infty, \qquad (8)$$

where $J_{r,d} \in (0, \infty)$.

(b) Non asymptotic upper bound (see e.g. f^). Let $d \ge 1$. There exists
$C_{d,r,\eta} \in (0, \infty)$ such that, for every $\mathbb{R}^d$-valued random vector $X$,

$$\forall\, N \ge 1, \quad e_{N,r}(X, \mathbb{R}^d) \le C_{d,r,\eta}\, \|X\|_{r+\eta}\, N^{-\frac{1}{d}}.$$
Remarks. • The real constant $J_{r,d}$ clearly corresponds to the case of the
uniform distribution over the unit hypercube $[0, 1]^d$ for which the slightly
more precise statement holds

$$\lim_N N^{\frac{1}{d}} e_{N,r}(X, \mathbb{R}^d) = \inf_N N^{\frac{1}{d}} e_{N,r}(X, \mathbb{R}^d) = J_{r,d}.$$

The proof is based on a self-similarity argument. The value of $J_{r,d}$ depends
on the reference norm on $\mathbb{R}^d$. When $d = 1$, elementary computations show
that $J_{r,1} = (r + 1)^{-\frac{1}{r}}/2$. When $d = 2$, with the canonical Euclidean norm,
one shows (see [37] for a proof, see also [17]) that $J_{2,2} = \sqrt{\frac{5}{18\sqrt{3}}}$. Its exact
value is unknown for $d \ge 3$ but, still for the canonical Euclidean norm, one
has (see [17]), using some random quantization arguments,

$$J_{2,d} \sim \sqrt{\frac{d}{2\pi e}} \approx \sqrt{\frac{d}{17.08}} \quad \text{as } d \to +\infty.$$

• When $\varphi \equiv 0$ the distribution of $X$ is purely singular. The rate (8) still
holds in the sense that $\lim_N N^{\frac{1}{d}} e_{N,r}(X, \mathbb{R}^d) = 0$. Consequently, this is not
the right asymptotics. The quantization problem for singular measures (like
uniform distribution on fractal compact sets) has been extensively investigated
by several authors, leading to the definition of a quantization dimension in
connection with the rate of convergence of the quantization error on these
sets. For more details we refer to [17], [18] and the references therein.

• A more naive way to quantize the uniform distribution on the unit hypercube
is to proceed by product quantization, i.e. by quantizing the marginals of the
uniform distribution. If $N = m^d$, $m \ge 1$, one easily proves that the best
quadratic product quantizer (for the canonical Euclidean norm on $\mathbb{R}^d$) is the
"midpoint square grid"

$$\Gamma^{sq,N} = \left(\frac{2i_1 - 1}{2m}, \dots, \frac{2i_d - 1}{2m}\right)_{1 \le i_1, \dots, i_d \le m}$$

which induces a quadratic quantization error equal to

$$\sqrt{\frac{d}{12}} \times N^{-\frac{1}{d}}.$$

Consequently, product quantizers are still rate optimal in every dimension $d$.
Moreover, note that the ratio of these two rates remains bounded as $d \uparrow \infty$.


6 Optimal quantization and QMC 

The principle of the Quasi-Monte Carlo method (QMC) is to approximate the
integral of a function $f : [0, 1]^d \to \mathbb{R}$ with respect to the uniform distribution
on $[0, 1]^d$, i.e. $\int_{[0,1]^d} f\, d\lambda_d = \int_{[0,1]^d} f(\xi^1, \dots, \xi^d)\, d\xi^1 \cdots d\xi^d$ ($\lambda_d$ denotes the
Lebesgue measure on $[0, 1]^d$), by the uniformly weighted sum

$$\frac{1}{N} \sum_{k=1}^N f(x_k)$$

of values of $f$ at the points of a so-called low discrepancy $N$-tuple $(x_1, \dots, x_N)$
(or set). This $N$-tuple can be the first $N$ terms of an infinite sequence.

If $f$ has finite variations denoted $V(f)$ - either in the measure sense (see [H
IT7]) or in the Hardy and Krause sense (see ^S] p. 19) - the Koksma-Hlawka
inequality provides an upper bound for the integration error induced by this
method, namely

$$\left|\frac{1}{N} \sum_{k=1}^N f(x_k) - \int_{[0,1]^d} f\, d\lambda_d\right| \le V(f)\, \mathrm{Disc}^*_N(x_1, \dots, x_N)$$

where

$$\mathrm{Disc}^*_N(x_1, \dots, x_N) := \sup_{y \in [0,1]^d} \left|\frac{1}{N} \sum_{k=1}^N \mathbf{1}_{\{x_k \in [\![0, y]\!]\}} - \lambda_d([\![0, y]\!])\right|$$

(with $[\![0, y]\!] = \prod_{i=1}^d [0, y^i]$, $y = (y^1, \dots, y^d) \in [0, 1]^d$). The error modulus
$\mathrm{Disc}^*_N(x_1, \dots, x_N)$ denotes the discrepancy at the origin of the $N$-tuple
$(x_1, \dots, x_N)$. For every $N \ge 1$, there exist $[0, 1]^d$-valued $N$-tuples $x^{(N)}$ such
that

$$\mathrm{Disc}^*_N(x^{(N)}) \le C_d\, \frac{(\log N)^{d-1}}{N}, \qquad (9)$$

where $C_d \in (0, \infty)$ is a real constant only depending on $d$. This result can
be proved using the so-called Hammersley procedure (see e.g. [3H], p. 31).
When $x^{(N)} = (x_1, \dots, x_N)$ is made of the first $N$ terms of a $[0, 1]^d$-valued
sequence $(x_k)_{k\ge 1}$, then the above upper bound has to be replaced by $C'_d\, \frac{(\log N)^d}{N}$
($C'_d \in (0, \infty)$). Such a sequence $x = (x_k)_{k\ge 1}$ is said to be a sequence with low
discrepancy (see [38] and the references therein for a comprehensive theoretical
overview, but also [H HT] for examples supported by numerical tests). When
one only has $\mathrm{Disc}^*_N(x_1, \dots, x_N) \to 0$ as $N \to \infty$, the sequence is said to be
uniformly distributed in $[0, 1]^d$.
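In dimension $d = 1$ the discrepancy at the origin can be computed exactly, because the supremum over $y$ is attained at (a one-sided limit at) one of the points of the $N$-tuple. The following sketch illustrates this; the function names are illustrative assumptions, and the van der Corput sequence in base 2 is used only as a classical example of a low discrepancy sequence, not as a construction taken from this paper.

```python
def star_discrepancy_1d(points):
    """Exact discrepancy at the origin of an N-tuple of [0,1] (case d = 1).

    With x_(1) <= ... <= x_(N) the sorted points, the supremum of
    |F_N(y) - y| equals max_i max(i/N - x_(i), x_(i) - (i-1)/N).
    """
    xs = sorted(points)
    n = len(xs)
    return max(max(i / n - x, x - (i - 1) / n) for i, x in enumerate(xs, start=1))

def van_der_corput(n, base=2):
    """First n terms of the van der Corput sequence in the given base."""
    seq = []
    for k in range(1, n + 1):
        q, x, f = k, 0.0, 1.0 / base
        while q > 0:
            q, digit = divmod(q, base)   # peel off the digits of k in the base
            x += digit * f               # and mirror them around the radix point
            f /= base
        seq.append(x)
    return seq

midpoints = [(2 * i - 1) / 20 for i in range(1, 11)]   # midpoint grid, N = 10
disc_midpoints = star_discrepancy_1d(midpoints)
disc_vdc = star_discrepancy_1d(van_der_corput(16))
```

For the midpoint grid the discrepancy equals $1/(2N)$, the smallest possible value for an $N$-tuple in one dimension.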

It is widely shared by QMC specialists that these rates are (in some sense)
optimal although this remains a conjecture except when $d = 1$. To be precise
what is known and what is conjectured is the following:

- Any $[0, 1]^d$-valued $N$-tuple $x^{(N)}$ satisfies $\mathrm{Disc}^*_N(x^{(N)}) \ge B_d N^{-1} (\log N)^{\beta(d)}$
where $\beta(d) = \frac{d-1}{2}$ if $d \ge 2$ (see [55] and also [3^ and the references therein),
$\beta(1) = 0$ and $B_d > 0$ is a real constant only depending on $d$; the conjecture is
that $\beta(d) = d - 1$.

- Any $[0, 1]^d$-valued sequence $(x_k)_{k\ge 1}$ satisfies $\mathrm{Disc}^*_N(x^{(N)}) \ge B'_d N^{-1} (\log N)^{\beta'(d)}$
for infinitely many $N$, where $\beta'(d) = \frac{d}{2}$ if $d \ge 2$ and $\beta'(1) = 1$ and $B'_d > 0$
is a real constant only depending on $d$; the conjecture is that $\beta'(d) = d$. This
follows from the result for $N$-tuples by the Hammersley procedure (see e.g. [3]).

Furthermore, as concerns the use of the Koksma-Hlawka inequality as an error
bound for QMC numerical integration, the different notions of finite variation
(which are closely connected) all become more and more restrictive - and
subsequently less and less "natural" as a regularity property of functions -
when the dimension $d$ increases. Thus the Lipschitz continuous function $f$
defined by $f(\xi^1, \xi^2, \xi^3) := (\xi^1 + \xi^2 + \xi^3) \wedge 1$ has infinite variation on $[0, 1]^3$.

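When applying Quasi-Monte Carlo approximation of integrals with
"standard" continuous functions on $[0, 1]^d$, the best known error bound, due to
Proinov, is given by the following theorem.

Theorem 2. (Proinov \5Uf ) (a) Assume $\mathbb{R}^d$ is equipped with the $\ell^\infty$-norm
$|(u^1, \dots, u^d)|_\infty := \max_{1\le i\le d} |u^i|$. Let $(x_1, \dots, x_N) \in ([0, 1]^d)^N$. For every
continuous function $f : [0, 1]^d \to \mathbb{R}$,

$$\left|\int_{[0,1]^d} f(u)\, du - \frac{1}{N} \sum_{k=1}^N f(x_k)\right| \le K_d\, \omega_f\Big(\big(\mathrm{Disc}^*_N(x_1, \dots, x_N)\big)^{\frac{1}{d}}\Big)$$

where $\omega_f(\delta) := \sup_{x, y \in [0,1]^d,\ |x - y|_\infty \le \delta} |f(x) - f(y)|$, $\delta \in (0, 1)$, is the uniform
continuity modulus of $f$ (with respect to the $\ell^\infty$-norm) and $K_d \in (0, \infty)$ is a
universal constant only depending on $d$.

(b) If $d = 1$, $K_d = 1$ and if $d \ge 2$, $K_d \in [1, 4]$.

Remark. Note that if $f$ is Lipschitz continuous, then $\omega_f(\delta) \le [f]_{\rm Lip}\, \delta$ where
$[f]_{\rm Lip}$ denotes the Lipschitz coefficient of $f$ (with respect to the $\ell^\infty$-norm).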

First, this result emphasizes that low discrepancy sequences or sets do
suffer from the curse of dimensionality when a QMC approximation is
implemented on functions having a "natural" regularity like Lipschitz continuity.

One also derives from this theorem an inequality between the $(L^1(\mathbb{P}), \ell^\infty)$-quantization
error of the uniform distribution $U([0, 1]^d)$ and the discrepancy
at the origin of an $N$-tuple $(x_1, \dots, x_N)$, namely

$$\big\|U - \widehat{U}^{\{x_1, \dots, x_N\}}\big\|_1 \le K_d\, \big(\mathrm{Disc}^*_N(x_1, \dots, x_N)\big)^{\frac{1}{d}},$$

since the function $\xi \mapsto \min_{1\le k\le N} |x_k - \xi|_\infty$ is clearly $\ell^\infty$-Lipschitz
continuous with Lipschitz coefficient 1. The inequality also follows from the
characterization established in (6) (which is clearly still true for non-Euclidean
norms). Then one may derive some bounds for Euclidean norms (and in fact
any norms) on $\mathbb{R}^d$ (probably not sharp in terms of constant) since all the
norms are strongly equivalent. However the bounds for the optimal quantization
error derived from Zador's Theorem ($O(N^{-\frac{1}{d}})$) and those for low discrepancy
sets (see (9)) suggest that overall, optimal quantization provides lower error
bounds for numerical integration of Lipschitz functions than low discrepancy
sets, at least for generic values of $N$. (However, standard computations
show that for midpoint square grids (with $N = m^d$ points) both quantization
errors and discrepancy behave like $\frac{1}{m} = N^{-\frac{1}{d}}$.)



7 Optimal quadratic functional quantization of Gaussian 
processes 

Optimal quadratic functional quantization of Gaussian processes is closely
related to their so-called Karhunen-Loève expansion which can be seen in some
sense as an infinite dimensional Principal Component Analysis (PCA) of a
(Gaussian) process. Before stating a general result for Gaussian processes, we
start with the standard Brownian motion: it is the most important example in
view of (numerical) applications and for this process, everything can be made
explicit.



7.1 Brownian motion 



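One considers the Hilbert space $H = L^2_T := L^2([0, T], dt)$, $(f|g)_2 = \int_0^T f(t) g(t)\, dt$,
$|f|_{L^2_T} = \sqrt{(f|f)_2}$. The covariance operator $C_W$ of the Brownian motion
$W = (W_t)_{t\in[0,T]}$ is defined on $L^2_T$ by

$$C_W(f) := \mathbb{E}\big((f|W)_2\, W\big) = \left(t \mapsto \int_0^T (s \wedge t)\, f(s)\, ds\right).$$

It is a symmetric positive trace class operator which can be diagonalized in
the so-called Karhunen-Loève (K-L) orthonormal basis $(e_n^W)_{n\ge 1}$ of $L^2_T$, with
eigenvalues $(\lambda_n)_{n\ge 1}$, given by

$$e_n^W(t) = \sqrt{\frac{2}{T}}\, \sin\left(\pi \Big(n - \frac{1}{2}\Big) \frac{t}{T}\right), \qquad \lambda_n = \left(\frac{T}{\pi (n - \frac{1}{2})}\right)^2, \qquad n \ge 1.$$

This classical result can be established as a simple exercise by solving the
functional equation $C_W(f) = \lambda f$. In particular, one can expand $W$ itself on
this basis so that

$$W \overset{L^2_T}{=} \sum_{n\ge 1} (W|e_n^W)_2\, e_n^W.$$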


Now, the orthonormality of the (K-L) basis implies, using Fubini's Theorem,

$$E\big((W|e^W_k)_2\,(W|e^W_\ell)_2\big) = \big(e^W_k\,|\,C_W(e^W_\ell)\big)_2 = \lambda_\ell\,\delta_{k\ell}$$

where $\delta_{k\ell}$ denotes the Kronecker symbol. Hence the Gaussian sequence
$((W|e^W_n)_2)_{n\ge1}$ is pairwise non-correlated, which implies that these random
variables are independent. The above identity also implies that
$\mathrm{Var}((W|e^W_n)_2) = \lambda_n$. Finally this shows that

$$W \overset{L^2_T}{=} \sum_{n\ge1}\sqrt{\lambda_n}\,\xi_n\,e^W_n \qquad (10)$$

where $\xi_n := (W|e^W_n)_2/\sqrt{\lambda_n}$, $n\ge1$, is an i.i.d. sequence of $\mathcal N(0;1)$-distributed
random variables. Furthermore, this K-L expansion converges in a much
stronger sense since $\sup_{t\in[0,T]}\big|W_t-\sum_{k=1}^n\sqrt{\lambda_k}\,\xi_k\,e^W_k(t)\big|\to0$ $P$-a.s. and

$$\Big\|\sup_{t\in[0,T]}\Big|W_t-\sum_{1\le k\le n}\sqrt{\lambda_k}\,\xi_k\,e^W_k(t)\Big|\Big\|_2 = O\Big(\sqrt{\log n/n}\Big)$$

(see e.g. [32]). Similar results (with various rates) hold true for a wide class
of Gaussian processes expanded on "admissible" bases (see e.g. [34]).
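As a quick numerical check of the K-L formulas for the Brownian motion, the following Python sketch evaluates the eigenvalues and the basis and draws one path of the truncated expansion (10). NumPy is assumed, and the function names are illustrative choices, not taken from the paper.

```python
import numpy as np

def kl_eigenvalues(n_terms, T=1.0):
    """Eigenvalues lambda_n = (T / (pi (n - 1/2)))^2 for n = 1..n_terms."""
    n = np.arange(1, n_terms + 1)
    return (T / (np.pi * (n - 0.5))) ** 2

def kl_basis(n_terms, t, T=1.0):
    """Row n-1 holds e_n(t) = sqrt(2/T) sin(pi (n - 1/2) t / T) on the grid t."""
    n = np.arange(1, n_terms + 1)[:, None]
    return np.sqrt(2.0 / T) * np.sin(np.pi * (n - 0.5) * t[None, :] / T)

def simulate_truncated_bm(n_terms, t, rng, T=1.0):
    """One path of sum_{n <= n_terms} sqrt(lambda_n) xi_n e_n(t), xi_n i.i.d. N(0, 1)."""
    xi = rng.standard_normal(n_terms)
    return (np.sqrt(kl_eigenvalues(n_terms, T)) * xi) @ kl_basis(n_terms, t, T)

lam = kl_eigenvalues(100_000)
print(lam[:2])     # the two leading eigenvalues for T = 1
print(lam.sum())   # partial sum of the trace, which tends to T^2 / 2

t = np.linspace(0.0, 1.0, 201)
path = simulate_truncated_bm(50, t, np.random.default_rng(0))
```

The steep drop from the first to the second eigenvalue is already visible in the first printed line.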

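Theorem 3. (JMl (2002) and JMI (2003)) Let $\Gamma^N$, $N\ge1$, be a sequence of
optimal $N$-quantizers for $W$.

(a) For every $N\ge1$, $\mathrm{span}(\Gamma^N) = \mathrm{span}\{e^W_1,\dots,e^W_{d(N)}\}$ with $d(N) = \Omega(\log N)$.
Furthermore $\widehat W^{\Gamma^N}$ and $W-\widehat W^{\Gamma^N}$ are independent.

(b) $e_N(W,L^2_T) = \|W-\widehat W^{\Gamma^N}\|_2\sim\dfrac{T\sqrt2}{\pi}\,\dfrac{1}{\sqrt{\log N}}$ as $N\to\infty$.

Remark. • The fact, confirmed by numerical experiments (see Section 7.3,
Figure 6), that $d(N)\sim\log N$ holds as a conjecture.

• Denoting $\Pi_d$ the orthogonal projection on $\mathrm{span}\{e^W_1,\dots,e^W_d\}$, one derives
from (a) that $\widehat W^{\Gamma^N} = \widehat{\Pi_{d(N)}(W)}^{\Gamma^N}$ (optimal quantization at level $N$) and

$$\|W-\widehat W^{\Gamma^N}\|_2^2 = \big\|\Pi_{d(N)}(W)-\widehat{\Pi_{d(N)}(W)}^{\Gamma^N}\big\|_2^2+\|W-\Pi_{d(N)}(W)\|_2^2 = e_N\big(Z_{d(N)},\mathbb R^{d(N)}\big)^2+\sum_{n\ge d(N)+1}\lambda_n$$

where $Z_{d(N)}\overset{d}{=}\Pi_{d(N)}(W)\sim\bigotimes_{k=1}^{d(N)}\mathcal N(0;\lambda_k)$.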

7.2 Centered Gaussian processes 

The above Theorem 3, devoted to the standard Brownian motion, is a particular
case of a more general theorem which holds for a wide class of Gaussian
processes.

Theorem 4. ([29] (2002) and [30] (2004)) Let $X = (X_t)_{t\in[0,T]}$ be a Gaussian
process with K-L eigensystem $(\lambda^X_n,e^X_n)_{n\ge1}$ (where $\lambda^X_1\ge\lambda^X_2\ge\dots$ is non-increasing).
Let $\Gamma^N$, $N\ge1$, be a sequence of quadratic optimal $N$-quantizers
for $X$. Assume

$$\lambda^X_n\sim\frac{\kappa}{n^b}\quad\text{as } n\to\infty\quad(b>1).$$

(a) $\mathrm{span}(\Gamma^N) = \mathrm{span}\{e^X_1,\dots,e^X_{d^X(N)}\}$ and $d^X(N) = \Omega(\log N)$.

(b) $e_N(X,L^2_T) = \|X-\widehat X^{\Gamma^N}\|_2\sim\sqrt{\kappa}\,\sqrt{b^b(b-1)^{-1}}\,(2\log N)^{-\frac{b-1}{2}}$.

Remarks. • The above result admits an extension to the case $\lambda^X_n\sim\varphi(n)$ as
$n\to\infty$ with $\varphi$ regularly varying, index $-b<-1$ (see [30]). In [29], upper or
lower bounds are also established when

$$(\lambda^X_n\le\varphi(n),\ n\ge1)\quad\text{or}\quad(\lambda^X_n\ge\varphi(n),\ n\ge1).$$

• The sharp asymptotics $d^X(N)\sim\frac2b\log N$ holds as a conjecture.
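Applications to classical (centered) Gaussian processes.

• Brownian bridge: $X_t := W_t-\frac tTW_T$, $t\in[0,T]$, and $e^X_n(t) = \sqrt{2/T}\,\sin\big(\pi n\frac tT\big)$,
$\lambda^X_n = \big(\frac{T}{\pi n}\big)^2$, so that $e_N(X,L^2_T)\sim T\frac{\sqrt2}{\pi}(\log N)^{-\frac12}$.

• Fractional Brownian motion with Hurst constant $H\in(0,1)$:

$$e_N(W^H,L^2_T)\sim T^{H+\frac12}\,c(H)\,(\log N)^{-H}$$

where $c(H) = \Big(\dfrac{\Gamma(2H)\sin(\pi H)(1+2H)}{\pi}\Big)^{\frac12}\Big(\dfrac{1+2H}{2\pi}\Big)^{H}$ and $\Gamma(t)$ denotes the Gamma
function at $t>0$.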

• Some further explicit sharp rates can be derived from the above theorem
for other classes of Gaussian stochastic processes (see [30], 2004) like the
fractional Ornstein-Uhlenbeck processes, the Gaussian diffusions, a wide class
of Gaussian stationary processes (the quantization rate is derived from the high
frequency asymptotics of the spectral density, assumed to be square integrable
on the real line), the $m$-fold integrated Brownian motion, the fractional
Brownian sheet, etc.

• Of course some upper bounds can be derived for some even wider classes of
processes, based on the above first remark (see e.g. [29], 2002).

Extensions to $r,p\neq2$. When the processes have some self-similarity properties,
it is possible to obtain some sharp rates in the non purely quadratic case: this
has been done for fractional Brownian motion in [12] using quite different
techniques in which self-similarity plays a crucial role. It leads
to the following sharp rates, for $p\in[1,+\infty]$ and $r\in(0,\infty)$:

$$e_{N,r}(W^H,L^p_T)\sim T^{H+\frac1p}\,c(r,H)\,(\log N)^{-H},\qquad c(r,H)\in(0,+\infty).$$

7.3 Numerical optimization of quadratic functional quantization

Thanks to the scaling property of Brownian motion, one may focus on the
normalized case $T=1$. The numerical approach to optimal quantization of
the Brownian motion is essentially based on Theorem 3 and the remark that
follows: indeed these results show that quadratic optimal functional quantization
of a centered Gaussian process reduces to a finite dimensional optimal
quantization problem for a Gaussian distribution with a diagonal covariance
structure. Namely the optimization problem at level $N$ reads

$$(\mathcal O_N)\equiv\begin{cases} e_N(W,L^2_T)^2 := e_N\big(Z_{d(N)},\mathbb R^{d(N)}\big)^2+\displaystyle\sum_{k\ge d(N)+1}\lambda_k\\[4pt] \text{where } Z_{d(N)}\overset{d}{=}\displaystyle\bigotimes_{k=1}^{d(N)}\mathcal N(0,\lambda_k).\end{cases}$$

Moreover, if $\beta^N := \{\beta^N_1,\dots,\beta^N_N\}$ denotes an optimal $N$-quantizer of $Z_{d(N)}$,
then the optimal $N$-quantizer $\Gamma^N$ of $W$ reads $\Gamma^N = \{x^N_1,\dots,x^N_N\}$ with

$$x^N_i(t) = \sum_{1\le\ell\le d(N)}(\beta^N_i)^\ell\,e^W_\ell(t),\qquad i=1,\dots,N. \qquad (11)$$

The good news is that $(\mathcal O_N)$ is in fact a finite dimensional quantization
optimization problem for each $N\ge1$. The bad news is that the problem is
somewhat ill conditioned since the decrease of the eigenvalues of $W$ is very
steep for small values of $n$: $\lambda_1 = 0.40528\ldots$, $\lambda_2 = 0.04503\ldots\approx\lambda_1/10$. This is
probably one reason why former attempts to produce good quantizations
of the Brownian motion first focused on other kinds of quantizers like scalar
product quantizers (see [44] and Section 7.4 below) or $d$-dimensional block
product quantizations (see [56] and [35]).

Optimization of the (quadratic) quantization of $\mathbb R^d$-valued random vectors
has been extensively investigated since the early 1950's, first in 1-dimension,
then in higher dimension when the cost of numerical Monte Carlo simulation
was drastically cut down (see [15]). Recent applications of optimal vector
quantization to numerics turned out to be much more demanding in terms
of accuracy. In that direction, one may cite [43], [36] (mainly focused on numerical
optimization of the quadratic quantization of normal distributions).
To apply the methods developed in these papers, it is more convenient to
rewrite our optimization problem with respect to the standard $d$-dimensional
distribution $\mathcal N(0;I_d)$ by simply considering the Euclidean norm derived from
the covariance matrix $\mathrm{Diag}(\lambda_1,\dots,\lambda_{d(N)})$, i.e.

$$(\mathcal O_N)\Leftrightarrow\begin{cases} N\text{-optimal quantization of }\displaystyle\bigotimes_{k=1}^{d(N)}\mathcal N(0,1)\\[4pt] \text{for the covariance norm } |(z_1,\dots,z_{d(N)})|^2 = \displaystyle\sum_{k=1}^{d(N)}\lambda_kz_k^2.\end{cases}$$

The main point is of course that the dimension $d(N)$ is unknown. However
(see Figure 6), one clearly verifies on small values of $N$ that the conjecture
($d(N)\sim\log N$) is most likely true. Then for higher values of $N$ one relies
on it to shift from one dimension to another following the rule $d(N) = d$,
$N\in\{e^d,\dots,e^{d+1}-1\}$.
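The dimension rule above and the unquantized tail of the distortion are easy to tabulate. The sketch below is only an illustration: the function names are hypothetical, and flooring the dimension at 1 for $N\in\{1,2\}$ is an assumption made here, since the text states the rule only for $N\ge e^d$ with $d\ge1$.

```python
import math

def critical_dimension(N):
    """Conjectured rule d(N) = d for N in {e^d, ..., e^(d+1) - 1}.
    The floor at 1 for N < e is an assumption of this sketch."""
    return max(1, math.floor(math.log(N)))

def tail_variance(d, T=1.0):
    """sum_{k > d} lambda_k, computed as T^2 / 2 minus the first d eigenvalues."""
    head = sum((T / (math.pi * (k - 0.5))) ** 2 for k in range(1, d + 1))
    return T * T / 2.0 - head

for N in (10, 48, 96, 10_000):
    d = critical_dimension(N)
    print(N, d, tail_variance(d))
```

The tail term is the part of the squared quantization error that no $N$-quantizer living in the first $d(N)$ coordinates can remove.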

A toolbox for quantization optimization: a short overview 

Here is a short overview of stochastic optimization methods to compute optimal
or at least locally optimal quantizers in finite dimension. For more details
we refer to [33] and the references therein. Let $Z\overset{d}{=}\mathcal N(0;I_d)$.

Competitive Learning Vector Quantization (CLVQ). This procedure is a recursive
stochastic approximation gradient descent based on the integral representation
of the gradient $\nabla D^Z_N(x)$, $x\in(\mathbb R^d)^N$ (temporarily coming back to
$N$-tuple notation) of the distortion as the expectation of a local gradient, i.e.

$$\forall x^N\in(\mathbb R^d)^N,\quad\nabla D^Z_N(x^N) = E\big(\nabla D^Z_N(x^N,\zeta)\big),\qquad\zeta_k\ \text{i.i.d.},\ \zeta_1\overset{d}{=}\mathcal N(0,I_d),$$

so that, starting from $x^N(0)\in(\mathbb R^d)^N$, one sets

$$\forall k\ge0,\quad x^N(k+1) = x^N(k)-\frac{c}{k+1}\,\nabla D^Z_N\big(x^N(k),\zeta_{k+1}\big)$$

where $c\in(0,1]$ is a real constant to be tuned. As set, this looks quite formal
but the operating CLVQ procedure consists of two phases at each iteration:

(i) Competitive Phase: Search of the nearest neighbor $x^N(k)_{i^*(k+1)}$ of $\zeta_{k+1}$
among the components of $x^N(k)_i$, $i=1,\dots,N$ (using a "winning convention"
in case of conflict on the boundary of the Voronoi cells).

(ii) Cooperative Phase: One moves the winning component toward $\zeta_{k+1}$
using a dilatation, i.e. $x^N(k+1)_{i^*(k+1)} = \mathrm{Dilatation}_{\zeta_{k+1},\,1-\frac{c}{k+1}}\big(x^N(k)_{i^*(k+1)}\big)$.

This procedure is useful for small or medium values of $N$. For an extensive
study of this procedure, which turns out to be singular in the world of recursive
stochastic approximation algorithms, we refer to [40]. For general background
on stochastic approximation, we refer to [25], [3].
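The two phases can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the published grids: NumPy is assumed, the names `clvq_step` and `clvq` are hypothetical, the constant factor of the local gradient is absorbed into the step $c/(k+1)$, and ties are broken by taking the first nearest index as the "winning convention".

```python
import numpy as np

def clvq_step(grid, zeta, gamma):
    """One CLVQ iteration on grid (shape (N, d)), modified in place."""
    # Competitive phase: index of the nearest neighbour of zeta
    # (np.argmin returns the first minimiser, used here as winning convention).
    winner = int(np.argmin(np.sum((grid - zeta) ** 2, axis=1)))
    # Cooperative phase: dilatation with centre zeta and ratio 1 - gamma.
    grid[winner] = zeta + (1.0 - gamma) * (grid[winner] - zeta)
    return winner

def clvq(grid0, n_iter, c=1.0, seed=0):
    """Run n_iter CLVQ steps for Z ~ N(0, I_d) with step gamma_k = c / (k + 1)."""
    rng = np.random.default_rng(seed)
    grid = np.array(grid0, dtype=float)      # work on a copy of the starting grid
    d = grid.shape[1]
    for k in range(n_iter):
        zeta = rng.standard_normal(d)
        clvq_step(grid, zeta, c / (k + 1.0))
    return grid

start = np.random.default_rng(1).standard_normal((10, 2))   # N = 10 points, d = 2
grid = clvq(start, n_iter=20_000, c=0.5)
```

Only the winning component moves at each step, which is what makes one iteration cheap: its cost is dominated by the nearest neighbour search.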

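The randomized "Lloyd I procedure". This is the randomization of the stationarity
based fixed point procedure since any optimal quantizer satisfies the stationarity
property $\widehat Z^{\Gamma} = E(Z\,|\,\widehat Z^{\Gamma})$: starting from a grid $\Gamma(0)\subset\mathbb R^d$, one sets

$$\Gamma(k+1) := E\big(Z\,|\,\widehat Z^{\Gamma(k)}\big)(\Omega),\qquad k\ge0.$$

At every iteration the conditional expectation $E(Z\,|\,\widehat Z^{\Gamma(k)})$ is computed using
a Monte Carlo simulation. For more details about practical aspects of the Lloyd I
procedure we refer to [43]. In [36], an approach based on genetic evolutionary
algorithms is developed.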
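A randomized Lloyd I iteration can be sketched as follows. This is an illustrative Python version under stated assumptions: NumPy is assumed, the function name and default parameters are hypothetical, a fresh Monte Carlo sample is drawn at every iteration, and a point whose cell receives no sample is simply left unchanged.

```python
import numpy as np

def randomized_lloyd(grid0, n_iter=30, n_samples=100_000, seed=0):
    """Replace each point by a Monte Carlo estimate of E(Z | Z in its Voronoi cell),
    for Z ~ N(0, I_d), and iterate."""
    rng = np.random.default_rng(seed)
    grid = np.array(grid0, dtype=float)
    n_points, d = grid.shape
    for _ in range(n_iter):
        z = rng.standard_normal((n_samples, d))
        # Squared distances to every grid point, shape (n_samples, n_points).
        d2 = ((z[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
        cell = d2.argmin(axis=1)
        for i in range(n_points):
            members = z[cell == i]
            if len(members) > 0:             # an empty cell keeps its old point
                grid[i] = members.mean(axis=0)
    return grid

grid = randomized_lloyd(np.array([[-1.0], [0.5]]))
print(np.sort(grid.ravel()))   # close to -sqrt(2/pi) and +sqrt(2/pi) in dimension 1
```

In dimension 1 with two points the limit is known in closed form, which makes this a convenient sanity check before moving to the dimensions $d(N)$ used for the Brownian motion.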

For both procedures, one may substitute a sequence of quasi-random numbers
for the usual pseudo-random sequence. This often speeds up the rate of
convergence of the method, although this can only be proved (see [27]) for a
very specific class of stochastic algorithms (to which CLVQ does not belong).

The most important step to preserve the accuracy of the quantization as
$N$ (and $d(N)$) increase is to use the so-called splitting method which finds
its origin in the proof of the existence of an optimal $N$-quantizer: once the
optimization of a quantization grid of size $N$ is achieved, one specifies the
starting grid for the size $N+1$ or more generally $N+\nu$, $\nu\ge1$, by merging
the optimized grid of size $N$ resulting from the former procedure with $\nu$ points
sampled independently from the normal distribution with probability density
proportional to $\varphi^{\frac{d}{d+2}}$ where $\varphi$ denotes the p.d.f. of $\mathcal N(0;I_d)$. This rather
unexpected choice is motivated by the fact that this distribution provides the
lowest in average random quantization error (see [6]).
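The starting grid of the splitting method is straightforward to generate, because a power of a centered Gaussian density is again a centered Gaussian density: $\varphi^{\frac{d}{d+2}}$ is proportional to the density of $\mathcal N(0;(1+\frac2d)I_d)$. A minimal Python sketch (NumPy assumed, function name hypothetical):

```python
import numpy as np

def splitting_start(grid, nu, rng):
    """Starting grid of size N + nu: the optimized grid of size N plus nu points
    drawn from the density proportional to phi^(d/(d+2)), i.e. N(0, (1 + 2/d) I_d)."""
    grid = np.asarray(grid, dtype=float)
    d = grid.shape[1]
    new_points = np.sqrt(1.0 + 2.0 / d) * rng.standard_normal((nu, d))
    return np.vstack([grid, new_points])

optimized = np.zeros((5, 3))                 # placeholder for an optimized grid, N = 5, d = 3
start = splitting_start(optimized, nu=2, rng=np.random.default_rng(0))
print(start.shape)
```

The enlarged grid is then fed to CLVQ or to the randomized Lloyd I procedure as its initial value.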

As a result, the following can be downloaded from the website [45] devoted to quantization:

www.quantize.maths-fi.com

o Optimized stationary codebooks for $W$: in practice, the $N$-quantizers $\beta^N$
of the distribution $\bigotimes_{k=1}^{d(N)}\mathcal N(0;\lambda_k)$, $N=1$ up to 10 000 ($d(N)$ runs from 1 up
to 9).

o Companion parameters:

- distribution of $\widehat W^{\Gamma^N}$: $P(\widehat W^{\Gamma^N} = x^N_i) = P(\widehat Z^{\beta^N}_{d(N)} = \beta^N_i)$ (in $\mathbb R^{d(N)}$).

- The quadratic quantization error: $\|W-\widehat W^{\Gamma^N}\|_2$.




Fig. 3. Optimized functional quantization of the Brownian motion $W$ for $N = 10, 15$
($d(N) = 2$). Top: $\beta^N$ depicted in $\mathbb R^2$. Bottom: the optimized $N$-quantizer $\Gamma^N$.







Fig. 4. Optimized functional quantization of the Brownian motion $W$. The $N$-quantizers
$\Gamma^N$. Left: $N = 48$ ($d(N) = 3$). Right: $N = 96$, $d(96) = 4$.




Fig. 5. Optimized $N$-quantizer $\Gamma^N$ of the Brownian motion $W$ on $[0,1]$ with $N = 400$ points. The
grey level of the paths codes their weights.



Fig. 6. Optimal functional quantization of the Brownian motion. $N\mapsto\log N\,(e_N(W,L^2_T))^2$,
$N\in\{6,\dots,160\}$. Vertical dashed lines: critical dimensions for
$d(N)$, $e^2\approx7$, $e^3\approx20$, $e^4\approx55$, $e^5\approx148$.

7.4 An alternative: product functional quantization

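Scalar product functional quantization is a quantization method which produces
rate optimal, although sub-optimal, quantizers. They were used e.g. in [29] to
provide exact rate (although not sharp) for a very large class of processes.
The first attempts to use functional quantization for numerical computation
with the Brownian motion were made with these quantizers (see [44]). We
will see further on their assets. What follows is presented for the Brownian
motion but would work for a large class of centered Gaussian processes.
Let us consider again the expansion of $W$ in its K-L basis:

$$W \overset{L^2_T}{=} \sum_{n\ge1}\sqrt{\lambda_n}\,\xi_n\,e^W_n$$

where $(\xi_n)_{n\ge1}$ is an i.i.d. sequence of $\mathcal N(0;1)$-distributed random variables (keep
in mind this convergence also holds a.s. uniformly in $t\in[0,T]$). The idea is
simply to quantize these (normalized) random coordinates $\xi_n$: for every $n\ge1$,
one considers an optimal $N_n$-quantization of $\xi_n$ denoted $\widehat\xi_n^{(N_n)}$ ($N_n\ge1$). For
$n>m$, set $N_n = 1$ and $\widehat\xi_n^{(N_n)} = 0$ (which is the optimal 1-quantization). The
integer $m$ is called the length of the product quantization. Then, one sets

$$\widehat W^{(N_1,\dots,N_m,\,prod)}_t := \sum_{n\ge1}\sqrt{\lambda_n}\,\widehat\xi_n^{(N_n)}e^W_n(t) = \sum_{n=1}^m\sqrt{\lambda_n}\,\widehat\xi_n^{(N_n)}e^W_n(t).$$

Such a quantizer takes $\prod_{n=1}^mN_n\le N$ values.

If one denotes by $\alpha^{(M)} = \{\alpha^{(M)}_1,\dots,\alpha^{(M)}_M\}$ the (unique) optimal quadratic
$M$-quantizer of the $\mathcal N(0;1)$-distribution, the underlying quantizer of the above
quantization $\widehat W^{(N_1,\dots,N_m,\,prod)}$ can be expressed as follows (if one introduces
the appropriate multi-indexation): for every multi-index $i := (i_1,\dots,i_m)\in\prod_{n=1}^m\{1,\dots,N_n\}$, set

$$x^{(N)}_i(t) := \sum_{n=1}^m\sqrt{\lambda_n}\,\alpha^{(N_n)}_{i_n}e^W_n(t)\quad\text{and}\quad\Gamma^{N_1,\dots,N_m,\,prod} := \Big\{x^{(N)}_i,\ i\in\prod_{n=1}^m\{1,\dots,N_n\}\Big\}.$$

Then the product quantization $\widehat W^{(N_1,\dots,N_m,\,prod)}$ can be rewritten as

$$\widehat W^{(N_1,\dots,N_m,\,prod)}_t = \sum_i\mathbf 1_{\{W\in C_i(\Gamma^{N_1,\dots,N_m,\,prod})\}}\,x^{(N)}_i(t)$$

where the Voronoi cell of $x^{(N)}_i$, expressed through the K-L coordinates $\xi_1,\dots,\xi_m$ of $W$, is given by

$$C_i(\Gamma^{N_1,\dots,N_m,\,prod}) = \prod_{n=1}^m\Big(\alpha^{(N_n)}_{i_n-\frac12},\,\alpha^{(N_n)}_{i_n+\frac12}\Big)$$

with $\alpha^{(M)}_{i\pm\frac12} := \dfrac{\alpha^{(M)}_i+\alpha^{(M)}_{i\pm1}}{2}$, $\alpha^{(M)}_0 = -\infty$, $\alpha^{(M)}_{M+1} = +\infty$.

Quantization rate by product quantizers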

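It is clear that the optimal product quantizer is the solution to the optimal
integral bit allocation

$$\min\Big\{\big\|\,|W-\widehat W^{(N_1,\dots,N_m,\,prod)}|_{L^2_T}\big\|_2,\ N_1,\dots,N_m\ge1,\ N_1\times\dots\times N_m\le N,\ m\ge1\Big\}. \qquad (12)$$

Expanding $\|W-\widehat W^{(N_1,\dots,N_m,\,prod)}\|_2^2 = \big\|\,|W-\widehat W^{(N_1,\dots,N_m,\,prod)}|_{L^2_T}\big\|_2^2$ yields

$$\|W-\widehat W^{(N_1,\dots,N_m,\,prod)}\|_2^2 = \sum_{n\ge1}\lambda_n\,\big\|\xi_n-\widehat\xi_n^{(N_n)}\big\|_2^2 \qquad (13)$$

$$\phantom{\|W-\widehat W^{(N_1,\dots,N_m,\,prod)}\|_2^2} = \sum_{n=1}^m\lambda_n\big(e^2_{N_n}(\mathcal N(0;1))-1\big)+\frac{T^2}{2} \qquad (14)$$

where $e_M(\mathcal N(0;1))$ denotes the quadratic quantization error of the optimal $M$-quantizer
of $\mathcal N(0;1)$, since $\displaystyle\sum_{n\ge1}\lambda_n = E\sum_{n\ge1}(W|e^W_n)_2^2 = E\int_0^TW_t^2\,dt = \int_0^Tt\,dt = \frac{T^2}{2}$.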

Theorem 5. (see [ ]) For every $N\ge1$, there exists an optimal scalar product
quantizer of size at most $N$ (or at level $N$), denoted $\widehat W^{(N,\,prod)}$, of the
Brownian motion defined as the solution to the minimization problem (12).
Furthermore these optimal product quantizers make up a rate optimal sequence:
there exists a real constant $c_W>0$ such that

$$\|W-\widehat W^{(N,\,prod)}\|_2\le\frac{c_W\,T}{(\log N)^{\frac12}}.$$

Proof (sketch of). By scaling one may assume without loss of generality
that $T=1$. Combining (13) and Zador's Theorem shows

$$\|W-\widehat W^{(N_1,\dots,N_m,\,prod)}\|_2^2\le C\Big(\sum_{n=1}^m\frac{1}{n^2N_n^2}+\sum_{n\ge m+1}\frac{1}{n^2}\Big)\le C'\Big(\sum_{n=1}^m\frac{1}{n^2N_n^2}+\frac1m\Big)$$

with $\prod_nN_n\le N$. Setting $m := m(N) = [\log N]$ and $N_k = \Big[\dfrac{(m!\,N)^{\frac1m}}{k}\Big]\ge1$,
$k=1,\dots,m$, yields the announced upper-bound. $\diamond$



Remarks. • One can show that the length $m(N)$ of the optimal quadratic
product quantizer satisfies

$$m(N)\sim\log N\quad\text{as } N\to+\infty.$$

• The most striking fact is that very few ingredients are necessary to make the
proof work as far as the quantization rate is concerned. We only need the basis
of $L^2_T$ on which $W$ is expanded to be orthonormal or the random coordinates
to be orthogonal in $L^2(P)$. This robustness of the proof has been used to obtain
some upper bounds for very wide classes of Gaussian processes by considering
alternative orthonormal bases of $L^2_T$ like the Haar basis for processes having
self-similarity properties (see [29]), or trigonometric bases for stationary processes
(see [29]). More recently, combined with the non asymptotic Zador's
Theorem, it was used to provide some connections between mean regularity
of stochastic processes and quantization rate (see Section 10 and [33]).

• Block quantizers combined with large deviations estimates were used to
provide the sharp rate obtained in Theorem 3 in [30].

• $d$-dimensional block quantization is also possible, possibly with varying block
size, providing a constructive approach to sharp rate, see [56] and [35].

• A similar approach can also provide some $L^r(P)$-rates for product quantization
with respect to the sup-norm over $[0,T]$, see [32].
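The explicit allocation used in the proof sketch of Theorem 5 can be tried numerically. The Python sketch below is illustrative only: the function name is hypothetical, and the length is floored at 1 for very small $N$, which is an assumption made here so that the formula is defined for every $N\ge1$.

```python
import math

def proof_allocation(N):
    """Sizes N_k = floor((m! N)^(1/m) / k), k = 1..m, with m = floor(log N).
    The floor of m at 1 for N < e is an assumption of this sketch."""
    m = max(1, math.floor(math.log(N)))
    root = (math.factorial(m) * N) ** (1.0 / m)
    return [math.floor(root / k) for k in range(1, m + 1)]

for N in (10, 100, 10_000):
    alloc = proof_allocation(N)
    print(N, alloc, math.prod(alloc))
```

This allocation is only meant to reach the rate: its product stays below $N$ by construction, but it is in general far from the optimal allocations obtained by solving (12).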

How to use product quantizers for numerical computations?

For numerics one can assume by a scaling argument that $T = 1$. To use
product quantizers for numerics we need to have access to the quantizers (or
grid) at a given level $N$, their weights (and the quantization error). All these
quantities are available with product quantizers. In fact the first attempts to
use functional quantization for numerics (path dependent option pricing) were
carried out with product quantizers (see [44]).

• The optimal product quantizers (denoted $\Gamma^{(N,\,prod)}$) at level $N$ are explicit,
given the optimal quantizers of the scalar normal distribution $\mathcal N(0;1)$. In
fact the optimal allocation of the size $N_i$ of each marginal has been already
achieved up to very high values of $N$. Some typical optimal allocations (and
the resulting quadratic quantization errors) are reported in the table below.



   N          N_rec      Quant. Error   Opti. Alloc.
   1          1          0.7071         1
   10         10         0.3138         5-2
   100        96         0.2264         12-4-2
   1 000      966        0.1881         23-7-3-2
   10 000     9 984      0.1626         26-8-4-3-2-2
   100 000    97 920     0.1461         34-10-6-4-3-2-2

• The weights $P(\widehat W^{(N,\,prod)} = x_i)$ are explicit too: the normalized coordinates
$\xi_n$ of $W$ in its K-L basis are independent, consequently

$$P\big(\widehat W^{(N,\,prod)} = x_i\big) = P\big(\widehat\xi_n^{(N_n)} = \alpha^{(N_n)}_{i_n},\ n=1,\dots,m(N)\big) = \prod_{n=1}^{m(N)}\underbrace{P\big(\widehat\xi_n^{(N_n)} = \alpha^{(N_n)}_{i_n}\big)}_{\text{1D (tabulated) weights}}.$$

• Equation (14) shows that the (squared) quantization error of a product
quantizer can be straightforwardly computed as soon as one knows the eigenvalues
and the (squared) quantization error of the normal distributions for
the $N_i$'s.

The optimal allocations up to $N = 12\,000$ can be downloaded from the website
[45] as well as the necessary 1-dimensional optimal quantizers (including
the weights and the quantization error) of the scalar normal distribution (up
to a size of 500, which is quite enough for this purpose).
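The three ingredients listed above (grid, weights, quantization error) can be assembled for a given allocation as in the following Python sketch. It is an illustration under stated assumptions rather than the code behind the downloadable files: NumPy is assumed, all function names are hypothetical, and the scalar quantizers of $\mathcal N(0;1)$ are recomputed by a plain Lloyd fixed-point iteration instead of being read from tabulated values.

```python
import itertools
import math
import numpy as np

def _pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def _cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_quantizer(M, n_iter=3000):
    """Lloyd fixed-point iteration for a stationary M-quantizer of N(0, 1).
    Returns (points, weights, squared quantization error)."""
    pts = np.linspace(-2.0, 2.0, M) if M > 1 else np.zeros(1)
    for _ in range(n_iter):
        edges = np.concatenate(([-np.inf], 0.5 * (pts[:-1] + pts[1:]), [np.inf]))
        w = np.array([_cdf(b) - _cdf(a) for a, b in zip(edges[:-1], edges[1:])])
        first_moment = np.array([_pdf(a) - _pdf(b) for a, b in zip(edges[:-1], edges[1:])])
        pts = first_moment / w                    # conditional mean of each cell
    return pts, w, 1.0 - float(np.sum(w * pts ** 2))

def product_quantizer(alloc, t, T=1.0):
    """Paths x_i(t), weights and quantization error for alloc = (N_1, ..., N_m)."""
    n = np.arange(1, len(alloc) + 1)
    sqrt_lam = T / (np.pi * (n - 0.5))
    basis = np.sqrt(2.0 / T) * np.sin(np.pi * (n[:, None] - 0.5) * t[None, :] / T)
    marg = [normal_quantizer(M) for M in alloc]
    paths, weights = [], []
    for idx in itertools.product(*(range(M) for M in alloc)):
        alpha = np.array([marg[j][0][i] for j, i in enumerate(idx)])
        paths.append((sqrt_lam * alpha) @ basis)
        # Independence of the coordinates: the weight is a product of 1D weights.
        weights.append(math.prod(marg[j][1][i] for j, i in enumerate(idx)))
    # Equation (14): sum_n lambda_n (e_{N_n}^2 - 1) + T^2 / 2.
    sq_err = T * T / 2.0 + sum(l * (q[2] - 1.0) for l, q in zip(sqrt_lam ** 2, marg))
    return np.array(paths), np.array(weights), math.sqrt(sq_err)

t = np.linspace(0.0, 1.0, 101)
paths, weights, err = product_quantizer((5, 2), t)
print(paths.shape, weights.sum(), err)   # err can be compared with the row N = 10 of the table
```

For large marginal sizes the plain Lloyd iteration converges slowly, which is one reason why tabulated scalar quantizers are preferable in practice.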

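For numerical purposes we are also interested in the stationarity property
since such quantizers produce lower (weak) errors in cubature formulas.

Proposition 2. (see [44]) The product quantizers obtained from the K-L basis
are stationary quantizers (although sub-optimal).

Proof. Firstly, note that

$$\widehat W^{N,\,prod} = \sum_{n\ge1}\sqrt{\lambda_n}\,\widehat\xi_n^{(N_n)}e^W_n$$

so that $\sigma(\widehat W^{N,\,prod}) = \sigma(\widehat\xi_k^{(N_k)},\ k\ge1)$. Consequently

$$E\big(W\,|\,\widehat W^{N,\,prod}\big) = E\big(W\,|\,\sigma(\widehat\xi_k^{(N_k)},\ k\ge1)\big) = \sum_{n\ge1}\sqrt{\lambda_n}\,E\big(\xi_n\,|\,\sigma(\widehat\xi_k^{(N_k)},\ k\ge1)\big)\,e^W_n$$

$$\overset{\text{indep.}}{=}\sum_{n\ge1}\sqrt{\lambda_n}\,E\big(\xi_n\,|\,\widehat\xi_n^{(N_n)}\big)\,e^W_n = \sum_{n\ge1}\sqrt{\lambda_n}\,\widehat\xi_n^{(N_n)}\,e^W_n = \widehat W^{N,\,prod}.$$

Remarks. • This result is no longer true for product quantizers based on
other orthonormal bases.

• This shows the existence of non optimal stationary quantizers.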



7.5 Optimal vs product quadratic functional quantization ($T = 1$)

o (Numerical) Optimized Quantization: By scaling, we can assume without
loss of generality that $T = 1$. We carried out a huge optimization task in
order to produce some optimized quantization grids for the Brownian motion
by solving numerically $(\mathcal O_N)$ for $N = 1$ up to $N = 10\,000$:

$$e_N(W,L^2_T)^2\approx\frac{0.2194}{\log N},\qquad N = 1,\dots,10\,000.$$

Fig. 7. Product quantization of the Brownian motion: the $N_{rec}$-quantizer $\Gamma^{(N,\,prod)}$.
$N = 10$: $N_{rec} = 10$ and $N = 50$: $N_{rec} = 12\times4 = 48$.

Fig. 8. Product quantization of the Brownian motion on $[0,1]$: the $N_{rec}$-quantizer $\Gamma^{(N,\,prod)}$.
$N = 100$: $N_{rec} = 12\times4\times2 = 96$.

This value (see Figure 9 (left)) is significantly greater than the theoretical
(asymptotic) bound given by Theorem 3 which is

$$\lim_N\log N\,e_N(W,L^2_T)^2 = \frac{2}{\pi^2} = 0.2026\ldots$$

Our guess, supported by our numerical experiments, is that in fact
$N\mapsto\log N\,e_N(W,L^2_T)^2$ is possibly not monotone but unimodal.

o Optimal Product quantization: as displayed on Figure 9 (right),
one has approximately

$$\|W-\widehat W^{(N,\,prod)}\|_2^2\approx\frac{0.245}{\log N}.$$

o Optimal $d$-dimensional block product quantization: let us
briefly mention this approach developed in [56] in which product quantization
is achieved by quantizing some marginal blocks of size 1, 2 or 3. By this
approach, the corresponding constant is approximately 0.23, i.e. roughly in
between scalar product quantization and optimized numerical quantization.

The conclusion, confirmed by our numerical experiments on option pricing
(see Section 9), is that

- Optimal quantization is significantly more accurate in numerical experiments
but is much more demanding since it requires keeping off line, or at least
handling, large files (say 1 GB for $N = 10\,000$).

- Both approaches are included in the option pricer Premia (MATHFI
Project, Inria). An online benchmark is available on the website [45].




Fig. 9. Numerical quantization rates. Top (Optimal quantization). Line + + +:
$\log N\mapsto(\|W-\widehat W^{\Gamma^N}\|_2^2)^{-1}$. Dashed line: $\log N\mapsto\log N/0.2194$. Solid line: $\log N\mapsto\log N/0.25$.
Bottom (Product quantization). Line + + +: $\log N\mapsto\big(\min_{1\le k\le N}\|W-\widehat W^{k,\,prod}\|_2^2\big)^{-1}$.
Solid line: $\log N\mapsto\log N/0.25$.



8 Constructive functional quantization of diffusions

8.1 Rate optimality for Scalar Brownian diffusions

One considers on a probability space $(\Omega,\mathcal A,P)$ a homogeneous Brownian diffusion
process:

$$dX_t = b(X_t)\,dt+\vartheta(X_t)\,dW_t,\qquad X_0 = x_0\in\mathbb R,$$

where $b$ and $\vartheta$ are continuous on $\mathbb R$ with at most linear growth (i.e. $|b(x)|+|\vartheta(x)|\le C(1+|x|)$)
so that at least a weak solution to the equation exists.

To devise a constructive way to quantize the diffusion $X$, it seems natural
to start from a rate optimal quantization of the Brownian motion and to
obtain some "good" (but how good?) quantizers for the diffusion by solving
an appropriate ODE. So let $\Gamma^N = (w^N_1,\dots,w^N_N)$, $N\ge1$, be a sequence
of stationary rate optimal $N$-quantizers of $W$. One considers the following
(non-coupled) Integral Equations:

$$dx^{(N)}_i(t) = \Big(b\big(x^{(N)}_i(t)\big)-\frac12\vartheta\vartheta'\big(x^{(N)}_i(t)\big)\Big)dt+\vartheta\big(x^{(N)}_i(t)\big)\,dw^N_i(t). \qquad (15)$$

Set

$$\widetilde X^{(N)}_t = \sum_{k=1}^Nx^{(N)}_k(t)\,\mathbf 1_{C_k(\Gamma^N)}(W).$$

The process $\widetilde X^{(N)}$ is a non-Voronoi quantizer (since it is defined using the
Voronoi diagram of $W$). What is interesting is that it is a computable quantizer
(once the above integral equations have been solved) since the weights
$P(\widehat W^{\Gamma^N} = w^N_i)$ are known. The Voronoi quantization defined by $x^{(N)}$ induces
a lower quantization error but we have no access to its weights for numerics.

The good news is that $\widetilde X^{(N)}$ is already rate optimal.

Theorem 6. ([31] (2006)) Assume that $b$ is differentiable, $\vartheta$ is positive twice
differentiable and that $b'-b\frac{\vartheta'}{\vartheta}-\frac12\vartheta\vartheta''$ is bounded. Then

$$e_N(X,L^2_T)\le\|X-\widetilde X^{(N)}\|_2 = O\big((\log N)^{-\frac12}\big).$$

If furthermore $\vartheta\ge\varepsilon_0>0$, then $e_N(X,L^2_T)\approx(\log N)^{-\frac12}$.

Remarks. • For some results in the non homogeneous case, we refer to [31].
Furthermore, the above estimates still hold true for the $(L^r(P),L^p_T)$-quantization,
$1\le r,p<+\infty$, provided $\big\|\,|W-\widehat W^{\Gamma^N}|_{L^p_T}\big\|_r = O\big((\log N)^{-\frac12}\big)$.

• This result is closely connected to the Doss-Sussman approach (see e.g. [13])
and in fact the results can be extended to some classes of multi-dimensional diffusions
(whose diffusion coefficient is the inverse of the gradient of a diffeomorphism)
which include several standard multi-dimensional financial models
(including the Black-Scholes model).

• A sharp quantization rate $e_{N,r}(X,L^p_T)\sim c\,(\log N)^{-\frac12}$ for scalar elliptic diffusions
is established in [10], [11] using a non constructive approach, $1\le p\le\infty$.

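Example: Rate optimal product quantization of the Ornstein-Uhlenbeck process.

$$dX_t = -kX_t\,dt+\vartheta\,dW_t,\qquad X_0 = x_0.$$

One solves the non-coupled integral (linear) system

$$x_i(t) = x_0-k\int_0^tx_i(s)\,ds+\vartheta\,w^N_i(t),$$

where $\Gamma^N := \{w^N_1,\dots,w^N_N\}$, $N\ge1$, is a rate optimal sequence of quantizers

$$w^N_i(t) = \sqrt{\frac2T}\sum_{\ell\ge1}\varpi_{i,\ell}\,\frac{T}{\pi(\ell-1/2)}\,\sin\Big(\pi(\ell-1/2)\frac tT\Big),\qquad i=1,\dots,N.$$

If $\Gamma^N$ is optimal for $W$ then $\varpi_{i,\ell} := (\beta^N_i)^\ell/\sqrt{\lambda_\ell}$, $i=1,\dots,N$, $1\le\ell\le d(N)$, with
the notations introduced in (11). If $\Gamma^N$ is an optimal product quantizer (and
$N_1,\dots,N_\ell,\dots$ denote the optimal size allocation), then $\varpi_{i,\ell} = \alpha^{(N_\ell)}_{i_\ell}$ where
$i := (i_1,\dots,i_\ell,\dots)\in\prod_{\ell\ge1}\{1,\dots,N_\ell\}$. Elementary computations show that

$$x^N_i(t) = e^{-kt}x_0+\vartheta\sum_{\ell\ge1}\varpi_{i,\ell}\,c_\ell\,\varphi_\ell(t)$$

with $c_\ell = \dfrac{T^2}{(\pi(\ell-1/2))^2+(kT)^2}$

and $\varphi_\ell(t) := \sqrt{\dfrac2T}\Big(\dfrac\pi T(\ell-1/2)\sin\Big(\pi(\ell-1/2)\dfrac tT\Big)+k\Big(\cos\Big(\pi(\ell-1/2)\dfrac tT\Big)-e^{-kt}\Big)\Big)$.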

8.2 Multi-dimensional diffusions for Stratonovich SDE's

The correcting term $-\frac12\vartheta\vartheta'$ coming up in the integral equations suggests
considering directly some diffusion in the Stratonovich sense

$$dX_t = b(t,X_t)\,dt+\vartheta(t,X_t)\circ dW_t,\qquad X_0 = x_0\in\mathbb R^d,\quad t\in[0,T],$$

(see e.g. [51] for an introduction) where $W = (W^1,\dots,W^d)$ is a $d$-dimensional
standard Brownian Motion.

In that framework, we need to introduce the notion of $p$-variation: a continuous
function $x:[0,T]\to\mathbb R^d$ has finite $p$-variation if

$$\mathrm{Var}_{p,[0,T]}(x) := \sup\Big\{\Big(\sum_{i=0}^{k-1}|x(t_i)-x(t_{i+1})|^p\Big)^{\frac1p},\ 0\le t_0\le t_1\le\dots\le t_k\le T,\ k\ge1\Big\}<+\infty.$$

Then $d_p(x,x') = |x(0)-x'(0)|+\mathrm{Var}_{p,[0,T]}(x-x')$ defines a distance on
the set of functions with finite $p$-variation. It is classical background that
$\mathrm{Var}_{p,[0,T]}(W(\omega))<+\infty$ $P(d\omega)$-a.s. for every $p>2$.

One way to quantize $W$ at level (at most) $N$ is to quantize each component
$W^i$ at level $\lfloor\sqrt[d]{N}\rfloor$. One shows (see [30]) that
$\big\|W-(\widehat W^{1,\lfloor\sqrt[d]{N}\rfloor},\dots,\widehat W^{d,\lfloor\sqrt[d]{N}\rfloor})\big\|_2 = O\big((\log N)^{-\frac12}\big)$.

Let $C_b^r([0,T]\times\mathbb R^d)$, $r>0$, denote the set of $\lfloor r\rfloor$-times differentiable bounded
functions $f:[0,T]\times\mathbb R^d\to\mathbb R^d$ with bounded partial derivatives up to order
$\lfloor r\rfloor$ and whose partial derivatives of order $\lfloor r\rfloor$ are $(r-\lfloor r\rfloor)$-Hölder.

Theorem 7. (see [46]) Let $b,\vartheta\in C_b^{2+\alpha}([0,T]\times\mathbb R^d)$ ($\alpha>0$) and let $\Gamma^N = \{w^N_1,\dots,w^N_N\}$,
$N\ge1$, be a sequence of $N$-quantizers of the standard $d$-dimensional
Brownian motion $W$ such that $\|W-\widehat W^{\Gamma^N}\|_2\to0$ as $N\to\infty$.
Let

$$\widetilde X^{(N)}_t := \sum_{i=1}^Nx^{(N)}_i(t)\,\mathbf 1_{C_i(\Gamma^N)}(W)$$

where, for every $i\in\{1,\dots,N\}$, $x^{(N)}_i$ is solution to

$$ODE_i\equiv\ dx^{(N)}_i(t) = b\big(t,x^{(N)}_i(t)\big)\,dt+\vartheta\big(t,x^{(N)}_i(t)\big)\,dw^N_i(t),\qquad x^{(N)}_i(0) = x_0.$$

Then, for every $p\in(2,\infty)$,

$$\mathrm{Var}_{p,[0,T]}\big(\widetilde X^{(N)}-X\big)\longrightarrow0\quad\text{as } N\to\infty.$$

Remarks. • The keys of this result are the Kolmogorov criterion, stationarity
(in a slightly extended sense) and the connection with rough paths theory
(see [28] for an introduction to rough paths theory, convergence in $p$-variation,
etc).

• In that general setting we have no convergence rate although we conjecture
that $\widetilde X^{(N)}$ remains rate optimal if $\widehat W^{\Gamma^N}$ is.

• There are also some results about the convergence of stochastic integrals of
the form $\int_0^tg(\widehat W^N_s)\,d\widehat B^N_s\to\int_0^tg(W_s)\circ dB_s$, with some rates of convergence
when $W = B$ or when $W$ and $B$ are independent (depending on the regularity of the
function $g$, see [46]).



9 Applications to path-dependent option pricing 

The typical functionals $F$ defined on $(L^2_T,|\cdot|_{L^2_T})$ for which $E(F(W))$ can
be approximated by the cubature formulae (5), (7) are of the form
$F(\omega) := \varphi\big(\int_0^Tf(t,\omega(t))\,dt\big)\mathbf 1_{\{\omega\in C([0,T],\mathbb R)\}}$ where $f:[0,T]\times\mathbb R\to\mathbb R$ is locally
Lipschitz continuous in the second variable, namely

$$\forall t\in[0,T],\ \forall u,v\in\mathbb R,\quad|f(t,u)-f(t,v)|\le C_f|u-v|\big(1+g(|u|)+g(|v|)\big)$$

(with $g:\mathbb R_+\to\mathbb R_+$ increasing, convex and $g(\sup_{t\in[0,T]}|W_t|)\in L^2(P)$) and
$\varphi:\mathbb R\to\mathbb R$ is Lipschitz continuous. One could consider for $\omega$ some càdlàg
functions as well. A classical example is the Asian payoff in a Black-Scholes
model

$$F(\omega) = \exp(-rT)\Big(\frac1T\int_0^Ts_0\exp\big(\sigma\omega(t)+(r-\sigma^2/2)t\big)\,dt-K\Big)_+.$$
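To make the role of such a functional concrete, the Python sketch below evaluates the Asian payoff on a discretized path and forms the weighted sum $\sum_iP(\widehat W = x_i)F(x_i)$ over a quantizer. It is a minimal illustration: NumPy is assumed, the function names are hypothetical, the time integral is approximated by the trapezoidal rule (a discretization choice made here), and the two-path "quantizer" at the end is a toy input rather than an optimized grid.

```python
import numpy as np

def asian_call_payoff(omega, t, s0, K, r, sigma):
    """Discounted Asian call payoff F(omega) in the Black-Scholes model,
    with the time average computed by the trapezoidal rule on the grid t."""
    T = t[-1]
    s = s0 * np.exp(sigma * omega + (r - 0.5 * sigma ** 2) * t)
    average = np.sum(0.5 * (s[1:] + s[:-1]) * np.diff(t)) / T
    return np.exp(-r * T) * max(average - K, 0.0)

def quantized_expectation(paths, weights, functional):
    """Weighted sum of the functional over the paths of a quantizer."""
    return sum(w * functional(x) for x, w in zip(paths, weights))

t = np.linspace(0.0, 1.0, 1001)
toy_paths = [0.3 * np.sin(0.5 * np.pi * t), -0.3 * np.sin(0.5 * np.pi * t)]
toy_weights = [0.5, 0.5]
price = quantized_expectation(
    toy_paths, toy_weights,
    lambda x: asian_call_payoff(x, t, s0=100.0, K=100.0, r=0.05, sigma=0.2))
print(price)
```

With an actual quantizer of $W$, `toy_paths` and `toy_weights` would be replaced by the grid $x_i$ and its weights.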



9.1 Numerical integration (II): log-Romberg extrapolation

Let $F:L^2_T\to\mathbb R$ be a 3 times $|\cdot|_{L^2_T}$-differentiable functional with bounded
differentials. Assume $\widehat W^{(N)}$, $N\ge1$, is a sequence of rate-optimal stationary
quantizations of the standard Brownian motion $W$. Assume furthermore that

$$E\big(D^2F(\widehat W^{(N)}).(W-\widehat W^{(N)})^{\otimes2}\big)\sim\frac{c}{\log N}\quad\text{as } N\to\infty \qquad (16)$$

and

$$E\,|W-\widehat W^{(N)}|^3_{L^2_T} = O\big((\log N)^{-\frac32}\big). \qquad (17)$$

Then, a higher order Taylor expansion yields

$$F(W) = F(\widehat W^{(N)})+DF(\widehat W^{(N)}).(W-\widehat W^{(N)})+\frac12D^2F(\widehat W^{(N)}).(W-\widehat W^{(N)})^{\otimes2}+\frac16D^3F(\zeta).(W-\widehat W^{(N)})^{\otimes3},\quad\zeta\in(\widehat W^{(N)},W),$$

so that, the first order term having zero expectation by stationarity,

$$E\,F(W) = E\,F(\widehat W^{(N)})+\frac{c}{2\log N}+o\big((\log N)^{-\frac32+\varepsilon}\big).$$

Then, one can design a log-Romberg extrapolation by considering $N$, $N'$,
$N<N'$ (e.g. $N'\approx4N$), so that

$$E(F(W))\approx\frac{\log N'\times E\big(F(\widehat W^{(N')})\big)-\log N\times E\big(F(\widehat W^{(N)})\big)}{\log N'-\log N}.$$

For practical implementation, it is suggested in [55] to replace $\log N$ by the
more consistent "estimator" $\|W-\widehat W^{(N)}\|_2^{-2}$.

In fact Assumption (16) holds true for optimal product quantization when
$F$ is a polynomial function with $d^\circ F = 2$. Assumption (17) holds true in that
case as well (see [U). As concerns optimal quantization, these statements
are still conjectures. However, given that $\widehat W$ and $W-\widehat W$ are independent
(see [55]), (16) is equivalent to the simple case where $D^2F(\widehat W^{(N)})$ is constant.

Note that the above extrapolation or some variants can be implemented
with other stochastic processes in accordance with the rate of convergence of
the quantization error.
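The extrapolation itself is a one-line formula. The Python sketch below (function name hypothetical) implements it and checks it on a synthetic sequence $A_N = I+\frac{c}{2\log N}$, for which the leading error term cancels exactly; the values of $I$ and $c$ are arbitrary test numbers, not results from the paper.

```python
import math

def log_romberg(value_N, value_Np, N, Np):
    """Combine two quantized estimates at sizes N < Np so that an error term
    proportional to 1 / log(size) cancels."""
    return (math.log(Np) * value_Np - math.log(N) * value_N) / (math.log(Np) - math.log(N))

# Synthetic check: A_N = I + c / (2 log N) is extrapolated back to I.
I, c = 3.0, 0.8
A = lambda N: I + c / (2.0 * math.log(N))
print(log_romberg(A(400), A(1600), 400, 1600))
```

Replacing `math.log(N)` by the inverse of the squared quantization error, as suggested in the text, only changes the two weights of the combination.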



9.2 Asian option pricing in a Heston stochastic volatility model 

In this section, we will price an Asian call option in a Heston stochastic volatility
model using some optimal (at least optimized) functional quantization of
the two Brownian motions that drive the diffusion. This model has already
been considered in [44] in which functional quantization was implemented for
the first time with some product quantizations of the Brownian motions. The
Heston stochastic volatility model was introduced in [25] to model stock price
dynamics. Its popularity partly comes from the existence of semi-closed forms
for vanilla European options, based on inverse Fourier transform, and from its
ability to reproduce some skewness shape of the implied volatility surface. We
consider it under its risk-neutral probability measure:

$$dS_t = S_t\big(r\,dt+\sqrt{v_t}\,dW^1_t\big),\qquad S_0 = s_0>0,\qquad\text{(risky asset)}$$

$$dv_t = k(a-v_t)\,dt+\vartheta\sqrt{v_t}\,dW^2_t,\qquad v_0>0,\qquad\text{with } d\langle W^1,W^2\rangle_t = \rho\,dt,\ \rho\in[-1,1],$$

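where $\vartheta$, $k$, $a$ are such that $\vartheta^2/(4ak)<1$. We consider the Asian Call payoff with
maturity $T$ and strike $K$. No closed form is available for its premium

$$\mathrm{AsCall}^{Hest} = e^{-rT}E\Big(\Big(\frac1T\int_0^TS_s\,ds-K\Big)_+\Big).$$

We briefly recall how to proceed (see [33] for details): first, one projects $W^1$
on $W^2$ so that $W^1 = \rho W^2+\sqrt{1-\rho^2}\,\widetilde W^1$ and

$$S_t = s_0\exp\Big(\big(r-\tfrac12\bar v_t\big)t+\rho\int_0^t\sqrt{v_s}\,dW^2_s\Big)\exp\Big(\sqrt{1-\rho^2}\int_0^t\sqrt{v_s}\,d\widetilde W^1_s\Big)$$

$$\phantom{S_t} = s_0\exp\Big(t\Big(\big(r-\tfrac{\rho ak}{\vartheta}\big)+\bar v_t\big(\tfrac{\rho k}{\vartheta}-\tfrac12\big)\Big)+\frac\rho\vartheta(v_t-v_0)\Big)\exp\Big(\sqrt{1-\rho^2}\int_0^t\sqrt{v_s}\,d\widetilde W^1_s\Big)$$

where $\bar v_t := \frac1t\int_0^tv_s\,ds$. The chaining rule for conditional expectations yields

$$\mathrm{AsCall}^{Hest}(s_0,K) = e^{-rT}E\Big(E\Big(\Big(\frac1T\int_0^TS_s\,ds-K\Big)_+\,\Big|\,\sigma(W^2_t,\ 0\le t\le T)\Big)\Big).$$

Combining these two expressions and using that $\widetilde W^1$ and $W^2$ are independent
shows that $\mathrm{AsCall}^{Hest}(s_0,K)$ is a functional of $(\widetilde W^1_t,v_t)$ (as concerns the
squared volatility process $v$, only $v_t$ and $\int_0^tv_s\,ds$ are involved).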

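Let $\Gamma^N = \{w^N_1,\dots,w^N_N\}$ be an $N$-quantizer of the Brownian motion. One
solves for $i=1,\dots,N$, the differential equations for $(v_t)$

$$dy_i(t) = k\Big(a-y_i(t)-\frac{\vartheta^2}{4k}\Big)dt+\vartheta\sqrt{y_i(t)}\,dw^N_i(t),\qquad y_i(0) = v_0, \qquad (18)$$

using e.g. a Runge-Kutta scheme. Let $y^{n,N}_i$ denote the approximation of $y_i$
resulting from the resolution of the above $ODE_i$ ($1/n$ is the time discretization
parameter of the scheme). Set the (non-Voronoi) $N$-quantization of $(v_t,S_t)$ by

$$\widetilde v^{\,n,N}_t = \sum_jy^{n,N}_j(t)\,\mathbf 1_{C_j(\Gamma^N)}(W^2) \qquad (19)$$

$$\widetilde S^{\,n,N}_t = \sum_i\sum_js^{n,N}_{i,j}(t)\,\mathbf 1_{C_i(\Gamma^N)}(\widetilde W^1)\,\mathbf 1_{C_j(\Gamma^N)}(W^2) \qquad (20)$$

with

$$s^{n,N}_{i,j}(t) = s_0\exp\Big(t\Big(\big(r-\tfrac{\rho ak}{\vartheta}\big)+\bar y^{\,n,N}_j(t)\big(\tfrac{\rho k}{\vartheta}-\tfrac12\big)\Big)+\frac\rho\vartheta\big(y^{n,N}_j(t)-v_0\big)\Big)\times\exp\Big(\sqrt{1-\rho^2}\int_0^t\sqrt{y^{n,N}_j(s)}\,dw^N_i(s)\Big)$$

and $\bar y^{\,n,N}_j(t) = \frac1t\int_0^ty^{n,N}_j(s)\,ds$.

Note this formula requires the computation of a quantized stochastic integral
$\int_0^t\sqrt{y^{n,N}_j(s)}\,dw^N_i(s)$ (which corresponds to the independent case).

The weights of the product cells $\{\widetilde W^1\in C_i(\Gamma^N),\ W^2\in C_j(\Gamma^N)\}$ are given by

$$P\big(\widetilde W^1\in C_i(\Gamma^N),\ W^2\in C_j(\Gamma^N)\big) = P\big(\widetilde W^1\in C_i(\Gamma^N)\big)\,P\big(W^2\in C_j(\Gamma^N)\big)$$

owing to the independence. For practical implementations, different sizes of
quantizers can be considered to quantize $\widetilde W^1$ and $W^2$.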
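Solving (18) along one quantizer path is a standard ODE integration. The Python sketch below uses the classical fourth order Runge-Kutta scheme; it is an illustration under stated assumptions: NumPy is assumed, the function name is hypothetical, the path derivative $w'(s)$ is supplied as a callable, the path used in the example is a made-up single-mode K-L path rather than an element of an optimized grid, and the clipping of $y$ at 0 inside the square root is a safety guard added here, not part of the scheme described in the text.

```python
import numpy as np

def heston_variance_ode(dw, t, v0, k, a, theta):
    """RK4 for dy = k (a - y - theta^2 / (4k)) dt + theta sqrt(y) dw(t), y(0) = v0.
    dw(s) must return the derivative w'(s) of the quantizer path."""
    def rhs(s, y):
        # max(y, 0) protects the square root against small negative overshoots.
        return k * (a - y - theta ** 2 / (4.0 * k)) + theta * np.sqrt(max(y, 0.0)) * dw(s)

    y = np.empty(len(t))
    y[0] = v0
    for j in range(len(t) - 1):
        h = t[j + 1] - t[j]
        k1 = rhs(t[j], y[j])
        k2 = rhs(t[j] + h / 2, y[j] + h * k1 / 2)
        k3 = rhs(t[j] + h / 2, y[j] + h * k2 / 2)
        k4 = rhs(t[j] + h, y[j] + h * k3)
        y[j + 1] = y[j] + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

k, a, theta, v0 = 2.0, 0.01, 0.2, 0.1
t = np.linspace(0.0, 1.0, 33)                # time step 1/32
# Hypothetical path w(s) = 0.8 sqrt(lambda_1) e_1(s), hence w'(s) = 0.8 sqrt(2) cos(pi s / 2).
dw = lambda s: 0.8 * np.sqrt(2.0) * np.cos(0.5 * np.pi * s)
y = heston_variance_ode(dw, t, v0, k, a, theta)
y_flat = heston_variance_ode(lambda s: 0.0, t, v0, k, a, theta)   # path w = 0
print(y[-1], y_flat[-1])
```

For the zero path the equation is linear, with solution $\bar a+(v_0-\bar a)e^{-kt}$ where $\bar a = a-\vartheta^2/(4k)$, which gives a simple check of the integrator.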

We follow the guidelines of the methodology introduced in [44]: we compute
the crude quantized premium for two sizes $N$ and $N'$, then carry out a space
Romberg log-extrapolation. Finally, we make a $K$-linear interpolation based
on the (Asian) forward moneyness $s_0e^{rT}\frac{1-e^{-rT}}{rT}\approx s_0e^{\frac{rT}{2}}$ (like in [44]) and the
Asian Call-Put parity formula

$$\mathrm{AsianCall}^{Hest}(s_0,K) = \mathrm{AsianPut}^{Hest}(s_0,K)+s_0\frac{1-e^{-rT}}{rT}-Ke^{-rT}.$$

The anchor strikes $K_{min}$ and $K_{max}$ of the extrapolation are chosen symmetric
with respect to the forward moneyness. At $K_{max}$, the Call is deep out-of-the-money:
one uses the Romberg extrapolated FQ computation; at $K_{min}$ the
Call is deep in-the-money: one computes the Call by parity. In between, one
carries out a linear interpolation in $K$ (which yields the best results, compared
to other extrapolations like the quadratic regression approach).

o Parameters of the Heston model: $s_0 = 100$, $k = 2$, $a = 0.01$, $\rho = 0.5$,
$v_0 = 10\%$, $\vartheta = 20\%$.

o Parameters of the option portfolio: $T = 1$, $K = 99,\dots,111$ (13 strikes).

o The reference price has been computed by a 10^ trial Monte Carlo sim- 
ulation (including a time Romberg extrapolation of the Euler scheme with 
2n = 256). 

o The differential equations (18) are solved with the parameters of the
quantization cubature formulae $\Delta t = 1/32$, with couples of quantization levels
$(N,M) = (400,100), (1000,100), (3200,400)$.

Functional Quantization can compute a whole vector (more than 10) of option
premia for the Asian option in the Heston model with 1 cent accuracy
in less than 1 second (implementation in C on a 2.5 GHz processor).

Fig. 10. $N$-quantizer of the Heston squared volatility process $(v_t)$ ($N = 400$) resulting
from an (optimized) $N$-quantizer of $W$.




Fig. 11. Quantized diffusions based on optimal functional quantization: pricing
by $K$-interpolated log-Romberg extrapolated FQ prices as a function of $K$: absolute
error with $(N,M) = (400,100)$, $(N,M) = (1000,100)$, $(N,M) = (3200,400)$. $T = 1$,
$s_0 = 50$, $K \in \{99, \dots, 111\}$, $k = 2$, $a = 0.01$, $\rho = 0.5$, $\vartheta = 0.1$.



Further numerical tests carried out or in progress with the B-S model and
with the SABR model (Asian, vanilla European options) show the same efficiency.
Furthermore, recent attempts to quantize the volatility process and the
asset dynamics at different levels of quantization seem very promising in two
directions: reduction of the computation time and increased robustness
of the method to parameter changes.

9.3 Comparison: optimized quantization vs (optimal) product 
quantization 

The comparison is balanced and probably needs some further in situ experiments
since it may depend on the modes of the computation. However, it
seems that product quantizers (as those implemented in [44]) are from 2 up
to 4 times less efficient than optimal quantizers within our range of application
(small values of $N$). On the other hand, the design of product quantizers from
1-dim scalar quantizers is easy and can be made from some light elementary
"bricks" (the scalar quantizers up to $N = 35$ and the optimal allocation rules).
Thus, the whole set of data needed to design all optimal product quantizers
up to $N = 10\,000$ is approximately 500 KB whereas one optimal quantizer
with size 10 000 takes approximately 1 MB.

[Figure: NX = 3200, NY = 400, interpolation, Dt = 1/32, 1/64, 1/128.]

Fig. 12. Quantized diffusions based on optimal functional quantization: pricing
by $K$-interpolated log-Romberg extrapolated FQ price as a function of $K$: convergence
as $\Delta t \to 0$ with $(N,M) = (3200,400)$ (absolute error). $T = 1$, $s_0 = 50$,
$K \in \{99, \dots, 111\}$, $k = 2$, $a = 0.01$, $\rho = 0.5$, $\vartheta = 0.1$.

Fig. 13. Quantized diffusions based on optimal product quantization: pricing by
$K$-linear interpolation of Romberg log-extrapolations as a function of $K$ (absolute
error) with $(M,N) = (96,966), (966,9984)$. $T = 1$, $s_0 = 50$, $k = 2$, $a = 0.01$, $\rho = 0.5$,
$\vartheta = 0.1$, $K \in \{44, \dots, 56\}$.

10 Universal quantization rate and mean regularity

The following theorem points out the connection between functional quantization
rate and mean regularity of $t \mapsto X_t$ from $[0,T]$ to $L^r(\mathbb P)$.

Theorem 8. ([33] (2005)) Let $X = (X_t)_{t\in[0,T]}$ be a stochastic process. If
there are $r^* \in (0,\infty)$ and $a \in (0,1]$ such that

$$X_0 \in L^{r^*}(\mathbb P), \qquad \|X_t - X_s\|_{L^{r^*}(\mathbb P)} \le C_X |t-s|^a,$$

for some positive real constant $C_X > 0$, then

$$\forall\, p, r \in (0, r^*), \qquad e_{N,r}(X, L^p_T) = O\big((\log N)^{-a}\big).$$

The proof is based on a constructive approach which involves the Haar basis
(instead of the K-L basis), the non-asymptotic version of Zador's Theorem and
product functional quantization. Roughly speaking, we use the unconditionality
of the Haar basis in every $L^p_T$ (when $1 < p < \infty$) and its wavelet feature, i.e.
its ability to "code" the path regularity of a function on the decay rate of its
coordinates.
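As a quick check of the assumption of Theorem 8 (a worked example added for illustration, using only the Gaussian scaling of Brownian increments), consider a standard Brownian motion $W$ and let $Z$ denote a standard normal random variable:

```latex
W_t - W_s \overset{d}{=} |t-s|^{1/2}\, Z
\;\Longrightarrow\;
\| W_t - W_s \|_{L^{r^*}(\mathbb P)} = \| Z \|_{L^{r^*}(\mathbb P)}\, |t-s|^{1/2}
\quad \text{for every } r^* \in (0,\infty),
\qquad \text{so } a = \tfrac12,\ C_X = \| Z \|_{L^{r^*}(\mathbb P)},
\quad \text{and} \quad
e_{N,r}(W, L^p_T) = O\big((\log N)^{-1/2}\big) \ \text{ for all } p, r \in (0,\infty).
```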

Examples (see [33]): • $d$-dimensional Itô processes (includes $d$-dim diffusions
with sublinear coefficients) with $a = 1/2$.

• General Lévy process $X$ with Lévy measure $\nu$ with square integrable big
jumps. If $X$ has a Brownian component, then $a = 1/2$; otherwise, if $\beta(X) > 0$,
where $\beta(X) := \inf\{\theta : \int_{\{|y|\le 1\}} |y|^{\theta}\,\nu(dy) < +\infty\} \in (0,2)$ (Blumenthal-Getoor index
of $X$), then $a = 1/\beta(X)$. This rate is the exact rate, i.e.

$$e_{N,r}(X, L^p_T) \approx (\log N)^{-a}$$

for many classes of Lévy processes like symmetric stable processes, Lévy processes
having a Brownian component, etc (see [33] for further examples).

• When $X$ is a compound Poisson process, then $\beta(X) = 0$ and one shows,
still with constructive methods, that

$$e_N(X) = O\big(e^{-(\log N)^{\vartheta}}\big), \qquad \vartheta \in (0,1),$$

which is in-between the finite and infinite dimensional settings.

11 About lower bounds 

In this overview, we gave no clue toward lower bounds although most of the
rates we mentioned are either exact ($\approx$) or sharp ($\sim$) (we tried to emphasize
the numerical aspects). Several approaches can be developed to get some lower
bounds. Historically, the first one was to rely on the subadditivity property of the
quantization error derived from the self-similarity of the distribution: this works
with the uniform distribution over $[0,1]^d$ but also in an infinite dimensional
framework (see e.g. [12] for the fractional Brownian motion).

A second approach consists in pointing out the connection with the
Shannon-Kolmogorov entropy (see e.g. [29]) using that the entropy of a random
variable taking at most $N$ values is at most $\log N$.

A third connection can be made with small deviation theory (see [9], [19]
and [33]). Thus, in [19], a connection is established between (functional) quantization
and small ball deviation for Gaussian processes. In particular this
approach provides a method to derive a lower bound for the quantization rate
from some upper bound for the small deviation problem. A careful reading of
the proof of Theorem 1.2 in [19] shows that this small deviation lower bound
holds for any unimodal (w.r.t. 0) non zero process. To be precise: assume that
$X$ is $L^p_T$-unimodal, i.e. there exists a real $\varepsilon_0 > 0$ such that

$$\forall\, x \in L^p_T,\ \forall\, \varepsilon \in (0, \varepsilon_0], \qquad \mathbb P\big(|X - x|_{L^p_T} \le \varepsilon\big) \le \mathbb P\big(|X|_{L^p_T} \le \varepsilon\big).$$

For centered Gaussian processes (or processes "subordinated" to Gaussian
processes) this follows from the Anderson Inequality (when $p \ge 1$). If

$$G\big(-\log \mathbb P(|X|_{L^p_T} \le \varepsilon)\big) = \Omega(1/\varepsilon) \quad \text{as } \varepsilon \to 0$$

for some increasing unbounded function $G : (0,\infty) \to (0,\infty)$, then

$$\forall\, c > 1, \qquad \liminf_N\, G(\log(cN))\, e_{N,r}(X, L^p_T) > 0, \quad r \in (0,\infty). \qquad (21)$$
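For instance (a worked illustration relying on the classical small ball estimate for the Brownian motion in $L^2([0,1],dt)$, which is not restated in this overview), take $X = W$, $T = 1$, $p = 2$ and $G(x) = \sqrt{x}$:

```latex
-\log \mathbb P\big(|W|_{L^2_T} \le \varepsilon\big) \sim \frac{1}{8\varepsilon^2}
\ \text{ as } \varepsilon \to 0
\;\Longrightarrow\;
G\big(-\log \mathbb P(|W|_{L^2_T} \le \varepsilon)\big) \sim \frac{1}{2\sqrt 2\,\varepsilon} = \Omega(1/\varepsilon),
\qquad \text{hence by (21)} \quad
\liminf_N \sqrt{\log N}\; e_{N,r}(W, L^2_T) > 0 .
```

Since $\log(cN) \sim \log N$, the constant $c$ plays no role here, and the resulting lower bound has the same $(\log N)^{-1/2}$ order as the upper bound provided by Theorem 8 with $a = 1/2$.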

This approach is efficient in the non quadratic case as emphasized in [33]
where several universal bounds are shown to be optimal using this approach.

Acknowledgement. I thank S. Graf, H. Luschgy, J. Printems and B.
Wilbertz for all the fruitful discussions and collaborations we have had about
functional quantization.